Graded exercise therapy for ME/CFS is not effective and unsafe. Re-analysis of a Cochrane review (2018) Health Psychology / Vink

Here's Cochrane's policy on withdrawing reviews (this came up on another thread):

https://community.cochrane.org/edit...ublished-cochrane-reviews-including-protocols

Apparently a review can be withdrawn because of quality issues. :)

The policy says, 'The decision to withdraw a published Cochrane Review (or protocol) should generally be made between the authors and the Cochrane Review Group (CRG) editorial team.'

This sounds a bit too chummy. If a review is at fault and its authors refuse to acknowledge that fault, someone needs to step in and force withdrawal.

I've lost track of what advocacy has been done in relation to Cochrane. Has Cochrane been asked to withdraw this review, @Mark Vink, @dave30th, @Jonathan Edwards? If not, is it worth making a call in relation to this policy?

Process in the event of serious errors in published Cochrane Reviews
https://community.cochrane.org/edit...ent-serious-errors-published-cochrane-reviews
 
Nobody else that I can see has mentioned the title: "Graded exercise therapy for ME/CFS is not effective and unsafe."

In my opinion this is poor and confusing English. It has a word missing, and should say: "Graded exercise therapy for ME/CFS is not effective and is unsafe."

Anyone else agree? Or am I making a mountain out of a molehill?
Definitely agree. I’ve been wondering whether to say something, but figured it’s probably too late to change once it’s published. It is something that should have been discussed with the author by the publishers or reviewers. Thankfully the paper is clear even though the title isn’t.
 
Nice one. Thanks to the authors.

I expected it to draw more on the points made by Courtney and Kindlon in the comments section of the Cochrane review, but it's really quite different. It focuses more on the problems with the studies included in the Cochrane review, so it works more as a companion piece to what has already been done than as a summary of the problems with the review itself.
While it doesn't mention some of the specific points I made/issues I raised, it does mention others, including many of the specific citations I used.

But I don't make any money from writing papers, comments and the like so I'm quite happy to see them being used for the cause. And a number of my own ideas over the years have been picked up from other publications and discussions on forums.

Also, he does cite some of my publications and in other cases it works better to cite the original source rather than my publications.

Mark does brilliantly to write these papers, despite his severe impairments. I'm very grateful to him for doing so. Thanks also to Alexandra Vink-Niese for her contribution. I am not familiar with any other work of hers on ME/CFS.
 
I thought this was very good. However, I just thought I would also highlight some minor points I was not convinced about.

Moreover, around 40 per cent of participants in seven of the eight studies (Wallman et al., 2004, the exception with 12%) suffered from co-morbid psychiatric disorders. The presence of a medical or psychiatric condition that may explain the chronic fatigue state excludes the classification as CFS in research studies because overlapping pathophysiology may confound findings specific to CFS (Reeves et al., 2003).

I've come across very few CFS studies which completely exclude people with psychiatric comorbidities such as depression (except major depressive disorder) and anxiety, though some stratify the samples in some or all of the analyses. No harm raising the issue, but just to say that the review is not different to most studies in including some people who also had mental health issues.
 
Further questions about inclusion are raised in Moss-Morris et al. (2005), where 77.6 per cent of patients were well enough to be in work, and in both Fulcher and White (1997) and Moss-Morris et al. (2005), where participants had normal VO2max scores.

Just to point out actual data in the trial:

Of the 49 patients eligible to participate, 70.9 per cent were female, and 22.4 per cent were unemployed and unable to work due to disability. Ages ranged from 19 to 60 years (mean age 40.9 years). The median duration of illness was 3.08 years, ranging from 6 months to 45 years.

So 77.6% weren't necessarily working. Some people could be in a bit of a grey area, e.g. work in the home.
 
All trials in the review, apart from Powell et al. (2001), used objective outcomes, so it would have been possible for the Cochrane review to have used them.

Yes, objective outcomes are very important in open-label trials like this. As well as using two subjective outcome measures as the primary outcomes, they also reported on lots and lots of other subjective outcome measures as secondary outcome measures.

The only objective outcome measure they reported on was health resource use for which they presented follow‐up data from one trial.
 
Only 6 per cent dropped out in the GET group of White et al. (2011); however, according to the supplement to the secondary mediation paper, there were missing step test data for 34 per cent (GET) (Chalder et al., 2015), which may have inflated any improvement in the GET group on that test. These dropouts add further doubts about the reliability of the review’s findings.
This can give the impression that there was an improvement in the step test results, but there wasn't.

It might have been more interesting to refer to the figure for the 6-minute walk test where there was a statistically significant improvement for the graded exercise therapy group, but there was missing data for 31%.
 
Because of their separate effects on fatigue, Fulcher excluded patients with a current psychiatric disorder or symptomatic insomnia, apart from simple co-morbid phobias, yet 30.3 per cent (20/66) in the trial were on full dose antidepressants. Symptomatic insomnia (sleep reversal and/or unrefreshing sleep) is a common and important symptom of CFS, so excluding such patients would seem to have excluded patients with a common symptom of CFS.
 
Participants in the exercise group had sessions of 5–15 minutes, increasing to a maximum of 30 minutes, at least 5 days a week. Such a workload would exclude most patients with CFS.
That is true. Researchers sometimes mention this figure of 30 minutes, 5 days a week, and this gets mentioned when describing the programme. But it's not clear that many, or even any, participants actually did this amount of exercise so such a goal would necessarily exclude the relevant population.

I remember Peter White said somewhere that the figure of 30 minutes 5 days a week was chosen because that is the figure in guidelines for the whole population in the UK. However, it is very questionable to suggest this for people with an unusual response to exercise. I think exercising every second day, while missing some days if necessary, is possibly the most suitable program. I remember reading somewhere that after two days without exercise you start losing fitness.
 
First, it used too broad criteria to select patients. Not only did it use the Oxford criteria, but it set a physical functioning score of <83.3 to designate caseness and 46 per cent of participants had a current psychiatric diagnosis.
Just to point out that while the threshold of less than 83.3 is certainly very high, there were a lot of other criteria also:

Subjects were assessed on entry to the trial (week 0) and at weeks 12 and 26. Subjects completed three self-rated questionnaires. These were: (a) the 14-item fatigue scale (Chalder et al, 1993) - we used a cut-off of four or more to designate caseness; (b) the Medical Outcomes Survey Short-Form Scales (MOS; Stewart et al, 1988) which produces a measure of general health status on the following six scales (cut-off scores for poor function in parentheses): physical function (<83.3), role or occupation function (≤50), social function (≤40), pain (≤50), health perceptions (≤70) and mental health (≤67); (c) the Hospital Anxiety and Depression Scales (HAD; Zigmond & Snaith, 1983) - cut-offs of 11 or more designated caseness.
 
Third, since more participants dropped out from and fewer complied fully with the GET groups, it would seem impossible to draw any safe conclusions. In total, 37.3 per cent dropped out of the two GET groups combined and only a small percentage (34.3%) of participants complied fully with GET. In contrast, only 21.7 per cent dropped out in the two exercise placebo groups and 78.3 per cent complied fully with exercise placebo. The difference in these two rates is particularly a concern since patients who dropped out were significantly more likely to have changed or given up their occupation as a result of their illness.
Yes, it seems interesting.

From the full Wearden et al, 1998 paper

Drop-out rate and characteristics
One hundred and fourteen (84%) subjects completed three months and 96 (71%) subjects completed six months of the trial. Subjects dropped-out at a greater rate throughout the six months with exercise than with non-exercise (25/68 (37%) v. 15/69 (22%), Cox's proportional hazards, P=0.05). The difference in drop-out rate between subjects allocated to fluoxetine (24/68 (36%)) or to placebo drug (16/69 (24%)) did not reach significance. Eleven subjects dropped out because of medication side-effects (two taking placebo drug), 16 because they were not improving or feeling worse, and 13 gave other reasons or no reason for dropping-out.

Drop-outs were significantly more likely than trial completers to be members of a self-help organisation (15/39 (39%) v. 20/95 (21%), χ² (1)=4.34, P=0.04), to have changed or given up their occupation as a result of their illness (38/40 (95%) v. 76/96 (79%), χ² (1)=5.22, P=0.02) and had significantly worse baseline scores on the MOS health perception scale (median (interquartile range) 5.0 (18.8) v. 15.0 (18.8) Mann-Whitney U=1433, P=0.02). There were no significant differences between drop-outs and trial completers on any other demographic feature nor baseline clinical measure.
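
For anyone who wants to check the two chi-square figures quoted above, here is a minimal sketch. It assumes the paper used an uncorrected Pearson chi-square on a 2x2 table (the excerpt does not say which variant was used), and the function name is just my own.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Self-help organisation membership: drop-outs 15/39 v. completers 20/95
self_help = pearson_chi2_2x2(15, 39 - 15, 20, 95 - 20)

# Changed or gave up occupation: drop-outs 38/40 v. completers 76/96
occupation = pearson_chi2_2x2(38, 40 - 38, 76, 96 - 76)

print(round(self_help, 2), round(occupation, 2))  # 4.34 5.22
```

Both values match the quoted statistics, which suggests the counts in the excerpt are internally consistent under that assumption.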
 
Furthermore, the mean fatigue scores at the end of treatment in the two GET groups were 29.9 and 28.0 so that patients were still ill enough to meet the entry criteria for the trial (cut-off of 4 or more to designate caseness; 0–42).
I would be careful about quoting the figure of 4 or more with regard to Likert scoring i.e. 0-42.

With the 11-item version, healthy people score around 11 on average: that is 11 items on which people, on average, answer "No more than usual", which scores 1 ("0" is for "Less than usual"). So 4+ would be a ridiculous cut-off point on Likert scoring.
Presumably the 4+ threshold refers to bimodal scoring, i.e. where the total score is 0-14 on the 14-item version. Nowadays only 11 items are usually used and the threshold is 4 or more on bimodal scoring.

The Wearden et al. paper itself probably uses both bimodal scoring and Likert scoring, but it doesn't look like it makes clear when it is using each one.
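
To make the difference between the two scoring methods concrete, here is a rough sketch. The response labels and the 0/1/2/3 (Likert) versus 0/0/1/1 (bimodal) item weights follow the usual Chalder fatigue scale convention; the function and variable names are only illustrative.

```python
# Item weights under each scoring scheme, in questionnaire order.
LIKERT = {
    "less than usual": 0,
    "no more than usual": 1,
    "more than usual": 2,
    "much more than usual": 3,
}
BIMODAL = {
    "less than usual": 0,
    "no more than usual": 0,
    "more than usual": 1,
    "much more than usual": 1,
}

def total_score(responses, scheme):
    """Sum the item scores for one completed questionnaire."""
    return sum(scheme[r] for r in responses)

# A hypothetical healthy person answering "no more than usual" on all 11 items.
healthy = ["no more than usual"] * 11

print(total_score(healthy, LIKERT), total_score(healthy, BIMODAL))  # 11 0
```

So the same healthy respondent is far above a cut-off of 4 on Likert scoring but well below it on bimodal scoring, which is why the threshold only makes sense for the latter.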
 
Powell et al. (2001)
Similar problems afflict Powell et al. (2001). The entry physical functioning score was <25, a score of 25 deemed similar to normal daily functioning for the UK general population. However, the main outcome measure and predetermined criterion for clinically important improvement was a score of 25 or more or an increase of >10 on the SF-36 physical functioning subscale (range: 10–30) 1 year after randomisation. In other words, a patient could enter the trial with a score of 24, improve to 25 over the course of the trial, and this minimal improvement would be deemed clinically important.
I checked this and it is true, though the baseline mean was 16.0.

Outcome measures

Patients were sent questionnaires containing validated measures of outcome by post before randomisation and at three, six, and 12 months. Primary outcomes were scores on the physical functioning subscale of the SF36 questionnaire and on the fatigue scale (range 0-11, scores >3 indicate excessive fatigue).17 The predetermined criterion for clinically important improvement at one year was a score of 25 or more or an increase of >10 from baseline on the physical functioning scale (range 10 to 30). This is similar to normal daily functioning for the UK general population.18 At baseline the mean score for physical functioning was 16.0.

24 is equivalent to 70 on the 0-100 scale, 25 is equivalent to 75 on the 0-100 scale, and 16 is equivalent to 30 on the 0-100 scale.
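
The conversion is just a linear rescaling of the 10-30 range onto 0-100. A quick sketch (the function name is mine, not from the paper):

```python
def pf_raw_to_percent(raw, low=10, high=30):
    """Rescale a raw SF-36 physical functioning score (10-30) onto the 0-100 scale."""
    return (raw - low) * 100 / (high - low)

for raw in (24, 25, 16):
    print(raw, "->", pf_raw_to_percent(raw))

# Entering at 24 and finishing at 25 is a change of 5 points on the 0-100 scale.
print(pf_raw_to_percent(25) - pf_raw_to_percent(24))
```

Each raw point is worth 5 points on the 0-100 scale, so moving from 24 to 25 corresponds to going from 70 to 75.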
 
Powell et al. (2001)
The trial employed non-equivalent controls which favoured the interventions. A minimum number of sessions were stipulated for the treatment groups: 3 hours face-to-face and 1 hour telephone contact for the minimum intervention group, 3 hours face-to-face and 4.5 hours telephone contact for the telephone intervention group and 10 hours face-to-face and 1.5 hour telephone contact for the maximum intervention group. There was no such specification for the control group. It is possible those in the control group had zero hours of contact and it was therefore no more than a waitlist group. This was recognised in the article where it is acknowledged that one of the limitations was the lack of a placebo control group that received equivalent therapist time and attention.

Yes indeed:
Patients in the control group received standardised medical care. This comprised a medical assessment, advice, and an information booklet that encouraged graded activity and positive thinking but gave no explanations for the symptoms. Patients were advised that they would be sent a questionnaire to assess their progress at three, six, and 12 months and discharged back to primary care.
 