Cochrane Review: 'Exercise therapy for chronic fatigue syndrome', Larun et al. - New version October 2019

Discussion in 'Psychosomatic research - ME/CFS and Long Covid' started by MEMarge, Oct 2, 2019.

  1. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,420
    Indirectness of evidence. Yes, this makes excellent sense to me. Investigators using evidence n levels removed from directly relevant evidence are, in a way, engaging in confirmation bias when they seek to use it as if it were direct evidence. I think it would be good to have some measure of how many levels removed it is; I imagine relevance might drop off geometrically with each additional level. So testing on animals might be one level, use of a proxy might be one level, and using a proxy when testing on animals might be two levels (NOTE: Just airing ideas here, not qualified to be anything more than that). My gut feeling is that one level would increase the uncertainty significantly, two levels would introduce major uncertainty, and three levels would blow it all out of the water. But just thinking off the top of my head really. And I've no idea if each level would necessarily carry the same weighting; it might depend on the type of indirection.
     
    alktipping likes this.
  2. Snow Leopard

    Snow Leopard Senior Member (Voting Rights)

    Messages:
    3,860
    Location:
    Australia
    Wouldn't a SMD of 0.44 be a difference of slightly less than 2.3 points on the Chalder Scale?

    Thanks, Cochrane blocks VPNs and Sci-hub just links to Cochrane.
     
    alktipping likes this.
  3. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,001
    Location:
    Belgium
    Yes, but the 0.44 figure is when you take out the outlier of Powell et al. (2001), so it's relevant to heterogeneity and inconsistency.

    Regarding imprecision, I calculated the value for the lower bound of the confidence interval, which was an SMD of 0.31.

    The calculation is quite easy. They have taken the standard deviation at baseline in an observational study by Crawley et al., which was 5.2. So if you multiply the SMD by 5.2, you get the point difference on the 33-point Chalder Fatigue Scale.

    5.2 x 0.66 = 3.4

    5.2 x 0.44 = 2.3

    5.2 x 0.31 = 1.6
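
    A minimal sketch of the same arithmetic in Python, using only the numbers quoted above. The function and constant names are made up for illustration.

```python
# Baseline SD of the Chalder Fatigue Scale in Crawley et al., as quoted in the post.
CRAWLEY_BASELINE_SD = 5.2


def smd_to_points(smd: float, sd: float = CRAWLEY_BASELINE_SD) -> float:
    """Re-express a standardised mean difference as points on the 33-point scale."""
    return smd * sd


# 0.44 is the estimate without Powell et al. (2001); 0.31 is the lower CI bound.
for smd in (0.66, 0.44, 0.31):
    print(f"SMD {smd:.2f} -> {smd_to_points(smd):.1f} points")
```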
     
  4. Kalliope

    Kalliope Senior Member (Voting Rights)

    Messages:
    6,570
    Location:
    Norway
    #MEAction: Cochrane Analysis: What's Here, What's Missing, Conclusions
    by Jamie S

    To believe that increased exercise is an effective therapy worth testing in a clinical trial, researchers and clinicians must believe that patients’ symptoms are either incorrect, imagined, or immaterial. This de facto leads to theories that the patient doesn’t know what’s best for him or her, even when it comes to the most basic self-interrogation and self-care. Deconditioning, “fear of movement”, and central sensitization as explanations for exercise’s potential success are all built on the foundation of this disbelief and dismissal.

    It is no wonder that an analysis of studies built on this foundation will showcase not only a very narrow range of the world’s ME research, but highlight some of the most dismissive, belief-based, and biased work in the field.

    Why perform a Cochrane review on exercise therapies in ME at all? Perhaps because it’s the only treatment with significant research to support it, no matter how poor.
     
    MEMarge, rainy, alktipping and 17 others like this.
  5. hinterland

    hinterland Senior Member (Voting Rights)

    Messages:
    343
    Yeah, I definitely agree it's a confusing questionnaire. I've not read the full thread, so I'm just answering in a general context...

    Some while ago I attended one of the NHS fatigue services' 10 week CFS management courses, or whatever they were called. It involved a group therapy CBT-type approach, and, needless to say, wasn't very helpful other than to meet other people with ME, or just to supply very basic information for the uninitiated.

    Before starting the course sessions I had to complete a bunch of questionnaires including Chalder Fatigue Scale, with no guidance given on how to interpret the questions. I seem to remember it being a bit ambiguous at the time but did the best I could, and where it said 'do you have problems with tiredness?... more than usual, etc' I compared my level of health with how I was before getting ME, so of course I was worse. However, when asked to complete the questionnaires again, after the course, we were given specific guidance on how to do so, and told to compare how we were now with how we were at the start of the course, 10 weeks ago. I distinctly remember this seemed to have the effect of artificially inflating our level of health, to make it appear the therapy had a positive impact, when, in fact, it had done nothing at all. I was just as unwell as at the start of the treatment but now had to mark my answers 'no more than usual', as I'd been instructed on a new reference point. So I was a bit miffed when my before and after scores were sent to my GP and apparently I'd miraculously made an improvement! What alchemy.
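
    A rough sketch of how that change of reference point could play out in the scores. It assumes the usual 11-item questionnaire with four answer options per item, scored 0-3 (Likert, total out of 33) or 0/0/1/1 (bimodal, total out of 11). The patient and the answers are entirely hypothetical.

```python
# Assumed scoring: 11 items, each answered on a four-point scale.
LIKERT = {
    "less than usual": 0,
    "no more than usual": 1,
    "more than usual": 2,
    "much more than usual": 3,
}
N_ITEMS = 11


def likert_total(answers):
    """Likert scoring (0-3 per item), giving a total between 0 and 33."""
    return sum(LIKERT[a] for a in answers)


def bimodal_total(answers):
    """Bimodal scoring (0, 0, 1, 1 per item), giving a total between 0 and 11."""
    return sum(1 for a in answers if LIKERT[a] >= 2)


# Hypothetical patient whose health does not change at all during the course.
# Before: "usual" is taken to mean life before ME, so every item is worse.
before = ["more than usual"] * N_ITEMS
# After: told to compare with the start of the course, so nothing is different.
after = ["no more than usual"] * N_ITEMS

print("Likert: ", likert_total(before), "->", likert_total(after))
print("Bimodal:", bimodal_total(before), "->", bimodal_total(after))
```

    In this made-up case the Likert total drops from 22 to 11 although nothing about the patient has changed, only the point of comparison.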

    You can have a go yourselves, here: behold the Chalder fatigue scale

     
    Cheshire, MEMarge, Hutan and 29 others like this.
  6. Trish

    Trish Moderator Staff Member

    Messages:
    55,414
    Location:
    UK
    That's fascinating, @hinterland. Sounds like outright fraud to me. Or at the very least, deliberate misrepresentation.

    Something for our NICE guidelines patient representatives to be aware of.
     
    Cheshire, MEMarge, Hutan and 15 others like this.
  7. Simon M

    Simon M Senior Member (Voting Rights)

    Messages:
    995
    Location:
    UK
    More brilliant analysis, @Michiel Tack, thank you.

    Using Cochrane fatigue effect size, PACE trial CBT was ineffective.

    I think this is very important because it undermines the PACE trial claim of clinical effectiveness for CBT. They chose the widely used criterion for a clinically useful difference of an effect size of 0.5 SD. They used the trial's baseline SD (presumably pooled across all treatment arms), c3.7, and this SD is artificially constrained because the Chalder fatigue scale score was used as an entry criterion.
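
    A small simulation sketch of why an entry criterion can shrink the SD. Everything here is hypothetical (the population, the cut-off and the 3-point difference are invented, not PACE data); it only illustrates the general effect of selecting on the same scale that is later used to standardise.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of fatigue scores on a 0-33 scale.
population = [min(33.0, max(0.0, random.gauss(22.0, 5.2))) for _ in range(10_000)]

# Hypothetical entry criterion: only people scoring at or above the cut-off enrol.
ENTRY_CUTOFF = 18.0
selected = [score for score in population if score >= ENTRY_CUTOFF]

sd_population = statistics.stdev(population)
sd_selected = statistics.stdev(selected)

# The same raw difference looks larger when divided by the narrower SD.
RAW_DIFFERENCE = 3.0  # hypothetical between-group difference in points
smd_population = RAW_DIFFERENCE / sd_population
smd_selected = RAW_DIFFERENCE / sd_selected

print(f"SD, whole population: {sd_population:.2f}")
print(f"SD, selected sample:  {sd_selected:.2f}")
print(f"Effect size using population SD: {smd_population:.2f}")
print(f"Effect size using selected SD:   {smd_selected:.2f}")
```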

    Using the less constrained SD from an observational trial makes more sense, as Cochrane did here, which gives an effect size for PACE of only 0.33 – below the 0.5 PACE trial threshold for clinically useful difference.

    Note that for PACE, CBT didn't make a clinically useful difference to self-reported physical function, even in the reported results.

    So using the Cochrane effect size for fatigue, CBT in PACE failed to make a clinically useful difference to either self-reported physical function or fatigue.
     
    Last edited: Oct 11, 2019
    alktipping, Annamaria, Sean and 6 others like this.
  8. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,420
    Yes @JaimeS. This is what my (somewhat rambling) post here was getting at.

    As we know, PACE a la GET was in no way seeking to trial a hypothesis of deconditioning; it was done on the assumption that the deconditioning theory for pwME was already established fact, already proven. They hypothesised that, on that basis, GET would fix the problem. But the deconditioning theory is totally blown, so they predicated their hypothesis on something having no basis in fact. Given the hypothesis for PACE a la GET was based on nothing but whim and fallacy, PACE should have no place in the scientific literature - it is a self-delusional fallacy of the authors.

    ETA: Especially when assumptions about harms from GET get generalised from PACE out to the whole ME population, and embedded within that is the notion that people whose root problem is deconditioning cannot be harmed by GET.
     
    Last edited: Oct 11, 2019
  9. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,420
    Yes.

    How are you now, compared to before you got ME? "I feel terrible."

    ... some weeks later ...

    How are you now, compared to when you last filled in this questionnaire? "About the same."

    Wow! Fantastic! So glad to have been able to help you!
     
    Hutan, alktipping, Annamaria and 16 others like this.
  10. ME/CFS Skeptic

    ME/CFS Skeptic Senior Member (Voting Rights)

    Messages:
    4,001
    Location:
    Belgium
    I'm not following. I don't think Cochrane used the 5.2 SD from Crawley et al. in their effect size calculation. I think they only used it to recalculate those effect sizes to points on the Chalder Fatigue Scale.

    The PACE trial gave a mean difference of -3.4 for CBT compared to SMC, on the 33-point Chalder Fatigue Scale. And the PACE authors defined a clinically useful difference as half a standard deviation and took the standard deviation in their sample. I'm not 100% sure that this sample was distorted because of the entry criteria: perhaps they used data from CFS patients who weren't selected in the trial as well? I've estimated the SD for the Chalder Fatigue Scale at baseline in the PACE trial participants at 3.77. So if we take half of that we get 1.88, a bit less than the 2 they have used in the trial. Both are less than the 3.4 MD.

    If we use the 5.2 as standard deviation, from the observational study by Crawley et al., and take half of that, we get a clinically useful difference of 2.6. The minimal important difference Larun et al. used, from the Lupus study, was 2.3. Both are less than the 3.4 point difference for CBT. Only when we use the threshold based on the clinical intuition of Ridsdale et al. (less than 4 points is not clinically significant) do we get a threshold that is higher than the 3.4 difference. GET had a mean difference of 3.2 points compared to SMC, so all the comparisons are pretty much the same.
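
    The comparisons above can be laid out as a quick check. This is only a sketch using the figures from this post; the labels and helper names are invented for illustration.

```python
# Mean differences versus SMC on the 33-point Chalder Fatigue Scale (from the post).
mean_differences = {"CBT": 3.4, "GET": 3.2}

# Candidate thresholds for a clinically useful difference, in scale points.
thresholds = {
    "half of estimated PACE baseline SD (3.77)": 3.77 / 2,
    "value used in the PACE trial": 2.0,
    "half of Crawley et al. baseline SD (5.2)": 5.2 / 2,
    "MID from the lupus study used by Larun et al.": 2.3,
    "Ridsdale et al. (below 4 points not clinically significant)": 4.0,
}


def failed_thresholds(difference: float) -> list:
    """Return the labels of thresholds that the given difference does not reach."""
    return [label for label, value in thresholds.items() if difference < value]


for therapy, difference in mean_differences.items():
    failed = failed_thresholds(difference)
    print(f"{therapy} ({difference} points) falls short of: {failed or 'none'}")
```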

    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    I think the reasoning does apply for physical function. In the PACE trial, the pooled SD at baseline was 15.74. So if we take half of that we get a clinically significant difference of about 7.9. The PACE trial authors used the figure of 8. But physical function was used in the entry criteria (first the threshold was a maximum score of 60, and after 11 months they changed it to a maximum of 65 to increase recruitment). So if they used the data of patients in the trial, their SD is distorted and probably smaller than it would have been in a normal, non-selected sample.

    In the observational study by Crawley et al., the SD for physical function at baseline was 22.7. Take half of that and we get a minimal clinically significant difference of 11.35, more than the change seen in the PACE trial for either GET (9.4) or CBT (7.1) compared to SMC.
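
    The same half-SD rule for physical function, as a sketch with the numbers from this post (variable names are invented for illustration):

```python
def half_sd(sd: float) -> float:
    """Half a standard deviation, the rule of thumb for a clinically useful difference."""
    return sd / 2


pace_threshold = half_sd(15.74)    # pooled baseline SD inside the PACE trial
crawley_threshold = half_sd(22.7)  # baseline SD in Crawley et al.

# Change in physical function compared to SMC, as quoted above.
changes = {"GET": 9.4, "CBT": 7.1}

for therapy, change in changes.items():
    print(
        f"{therapy}: {change} points | "
        f"reaches PACE-based threshold ({pace_threshold:.2f}): {change >= pace_threshold} | "
        f"reaches Crawley-based threshold ({crawley_threshold:.2f}): {change >= crawley_threshold}"
    )
```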
     
    Snow Leopard and Andy like this.
  11. Dolphin

    Dolphin Senior Member (Voting Rights)

    Messages:
    5,792
    For what it is worth:
    Chalder fatigue (bimodal) >= 6 was an entry criterion in the PACE Trial.
     
  12. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,498
    Location:
    Mid-Wales
    @Michiel Tack and @Simon M - using SDs of data that are either highly skewed (physical function) or that have a skewed, non-normal distribution (CFQ) is entirely flawed. To then use a CID that is so massively distorted by the scenario mentioned above by @hinterland and others makes no sense at all. The data do not warrant being treated as analysable in any way, shape or form!
     
    MEMarge, Hutan, alktipping and 13 others like this.
  13. Trish

    Trish Moderator Staff Member

    Messages:
    55,414
    Location:
    UK
    I'm glad you said that, @Lucibee.

    I really appreciate all the excellent points @Michiel Tack has made in deconstructing the problems with the Cochrane review, but I disagree with trying to out-analyse them using their own preferred techniques.

    Better, I think, to simply say, as Lucibee does, that these techniques are not applicable to this sort of data. No serious statistician should have fallen into the trap that the Cochrane reviewers did of applying inappropriate tests to non-linear, heavily skewed data.
     
    Last edited: Oct 11, 2019
  14. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,420
    Yes. I cannot claim to follow all this by any means.

    But any analysis inevitably has to make certain assumptions about the data being analysed, because the tools being used are themselves based on various assumptions about the data. So a valid analysis should itself include justifications of why it is valid, explaining how/why the data is within an acceptable margin of error, given the analytical tools' assumptions. And of course if the data is outside that margin of error, then that itself should be exposed. Would it be possible to do that with PACE, as a paper or article in itself, rooted in a thoroughly factual analysis?
     
  15. Esther12

    Esther12 Senior Member (Voting Rights)

    Messages:
    4,393
    It can still be a useful way of trying to think about and understand their work, but it's important to also remember to be critical of the preferences and judgements that lead to them conducting analyses of such questionable value. It's probably going to take us a while to get to grips with the new Larun review, and it's worth exploring all avenues.

    It seemed like Cochrane completely ignored many of the more fundamental concerns raised about their work, and looked only at the technicalities. I'm not sure what we can learn from that other than that Cochrane are rubbish. There's probably some lesson here.
     
    Hutan, alktipping, Annamaria and 5 others like this.
  16. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,498
    Location:
    Mid-Wales
    Yes, and that is the problem. We are all assuming that these data are analysable. They are not.
    We assume that they are measuring what we think they are measuring. They are not.

    But how do we demonstrate that when there are no standard and robust measures of fatigue in existence? We can't.

    Because Chalder and co have published their scale, have (self) "validated" it, have used it in multiple trials and published papers on it for decades and decades, it is taken as read that it does what it says on the tin, because it gives them the results they want.

    And because others have also based their work on it, they are highly unlikely to support any kind of action against it. Who is going to publish such an article against the CFQ when those most likely to review it have a vested interest in its continued existence?

    And unfortunately, this is how science is supposed to "work".
     
  17. Dolphin

    Dolphin Senior Member (Voting Rights)

    Messages:
    5,792
    By the way, following a consultation in the US about instruments to use for research in the last year or two, the Chalder fatigue scale was dropped from the original list.
     
    Cheshire, MEMarge, Hutan and 21 others like this.
  18. Barry

    Barry Senior Member (Voting Rights)

    Messages:
    8,420
    Yes, I see what you mean. The divergence of the PACE data from a normal distribution will be significantly influenced by the non-linearity of the data. But the non-linearity of the data is hard to pin down, because the data characteristic is pretty unknowable anyway, especially the Chalder FS. Is a '2' really twice as bad as a '1'? Or a '3' just half as bad again as a '2'? Or is it logarithmic, so a '3' is twice as bad as a '2'? Or is it something far more likely ... nothing of any known characteristic, other than it gets bigger each time? So yes, uninterpretable. Silk purses out of sows' ears.
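
    A toy illustration of why the unknown spacing matters, assuming nothing about the real scale: two made-up groups of item responses, scored once with equal steps and once with steps that double each time. Both codings keep the order 0 < 1 < 2 < 3, yet they disagree about which group is worse on average.

```python
from statistics import mean

# Hypothetical item responses (0-3) from two made-up groups.
group_a = [0, 0, 3, 3]
group_b = [2, 2, 2, 2]


def equal_steps(response: int) -> int:
    """Treat the answer options as evenly spaced: 0, 1, 2, 3."""
    return response


def doubling_steps(response: int) -> int:
    """Each step up adds twice as much as the one before: 0, 1, 3, 7."""
    return 2 ** response - 1


for name, coding in [("equal steps", equal_steps), ("doubling steps", doubling_steps)]:
    mean_a = mean(coding(r) for r in group_a)
    mean_b = mean(coding(r) for r in group_b)
    worse = "A" if mean_a > mean_b else "B"
    print(f"{name}: mean A = {mean_a}, mean B = {mean_b}, worse group = {worse}")
```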
     
    Last edited: Oct 11, 2019
  19. Snow Leopard

    Snow Leopard Senior Member (Voting Rights)

    Messages:
    3,860
    Location:
    Australia
    ¯\_(ツ)_/¯

    (caption - a giant shrug emoticon)
     
  20. Simon M

    Simon M Senior Member (Voting Rights)

    Messages:
    995
    Location:
    UK
    Certainly the SF36 SD problem has been well documented. I wasn't sure that CFQ SD had been shown to be "non-normal" (I think it's a pretty high threshold to reach, isn't the null hypothesis that every distribution is normal?).
     
    alktipping and ME/CFS Skeptic like this.
