Rethinking the treatment of chronic fatigue syndrome—A reanalysis and evaluation of findings from a recent major trial of graded exercise and CBT

I also had fun trying to work out who wrote each bit in the discussion :)
The actual written text was all me. But different points were raised by different authors (big contributors were Tom, Alem and David). So you can have a bit of fun guessing who raised each point.

Tom: Insightful, sharp comments based on his wide understanding of the literature. Lots of nice references that extended and expanded on major points.
David Tuller: "Stop sugar-coating it!" ;) Also, good comments about the researchers' justifications, and the wider political picture.
Alem: Specific arguments based on his long history working with the data and the PACE publications. Incredibly smart guy.

LOTS of work went on behind the scenes. Many people not listed as authors shared their thoughts and their careful research, and commented on earlier drafts. I was able to take advantage of so much expertise relating to MECFS research more generally. This really was a community-wide effort.
Excellent! :) I especially like ...
Our analysis based on the protocol-specified outcomes indicated that GET produces modest enhancements in patients’ perceived physical function, but has little effect on symptom perception. Conversely, CBT improved symptom perception – specifically, self-rated fatigue scores – but had little effect on perceived physical function. If these interventions were operating to create a genuine underlying change in illness status, we would expect change on one measure to be accompanied by change on the other.
... which smacks very strongly of expectation bias, depending on what expectation the treatment had instilled. As the authors very insightfully identify, any real improvement would show no such discrimination.
This was interesting, wasn't it, Barry? Although we were only considering which comparisons reached significance and which did not. To really demonstrate that the two treatments differentially affected fatigue and physical function scores, we'd need to do some sort of test of the interaction between treatment and outcome.

I might have a look into doing this (but it probably won't be significant).
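Something along these lines, perhaps: just a rough sketch with hypothetical column names, not an analysis we've actually run.

    # Rough sketch only: assumes hypothetical long-format data, one row per
    # participant per outcome measure, with change scores standardised so
    # the fatigue and physical function scales are comparable.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("pace_long.csv")  # hypothetical file
    # Assumed columns: subject, treatment ("CBT"/"GET"),
    # measure ("fatigue"/"physical"), change (standardised change score)

    # Random intercept per subject, since each participant contributes
    # scores on both measures.
    fit = smf.mixedlm("change ~ C(treatment) * C(measure)",
                      df, groups=df["subject"]).fit()
    print(fit.summary())

The treatment-by-measure interaction term is the thing to look at: it tests whether the difference between CBT and GET is itself different for fatigue than for physical function.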
If I remember right, GET participants were told that GET would give them better physical function.
This is a good point too. Might be worth trawling through the manual again to see what I can come up with here.
GET participant manual in the PACE trial, pages 27 and 28
You've made it even easier for me - thanks, @strategist!
As SW said:
"I don't mind people disagreeing on measures of recovery. They changed the recovery measure because they realised they had gone too extreme and they would have the problem that nobody would recover"
Thanks, @Sly Saint. Very interesting. If you know the source of that quote, I'd be interested to learn it.
Considering that an SF-36 physical functioning score of 85 is the bottom 7th percentile of the population, it is quite astounding that those "recovery" rates were 7% (CBT), 4% (GET), and 3% (control). Probably more than 7% of the population has chronic illnesses. The PACE trial in fact proves that CBT and GET do not work.
To be fair, the cohort we used to determine these figures excluded those with a significant long-term medical condition.

It's still really bad, though, isn't it?
I mean, silly PACE trial researchers, why didn't they go just a tiny bit further and make the recovery level 0? Then they would have gotten 100% recovery rates for any and all diseases and accomplished much more!
:laugh:!
Thank you for an impressive analysis that I think will come to be seen as a very important and influential piece of work.

Somewhere in the PACE section of the Wolfson website, the authors state clearly that this was an exploratory analysis of recovery. My understanding is that exploratory means after seeing and experimenting with the data.
Thanks, @Simon M. Yes, that's a good point, thanks for mentioning. Although they may be using the term more in the sense of "we're calling it 'exploratory' to get around the fact that we didn't use the definition we set out in the protocol".
 
No method of correction was specified in the trial protocol, ...
Does this mean that we can/can't say "using the prespecified analysis for the trial's primary outcome there was no significant treatment effect for [CBT and/or GET]"?

Or do we once again have to deal with niggling complexities which prevent a nice simple statement (ideally one suited to those of us not used to discussing Bonferroni correction)?
I think it would be hard for the researchers to come back and say 'no, we were never planning to use the Bonferroni method', because this is the method they used in all their published papers.
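For anyone not used to Bonferroni correction (as requested above!), the mechanics are simple: with m comparisons, you divide the significance threshold by m. A quick sketch in Python (the p-values here are made up for illustration, not the trial's actual numbers):

    # Illustration only: hypothetical p-values, not taken from PACE.
    raw_p = {
        "CBT vs SMC, fatigue": 0.011,
        "GET vs SMC, fatigue": 0.015,
        "CBT vs SMC, physical function": 0.020,
        "GET vs SMC, physical function": 0.030,
    }
    alpha = 0.05
    threshold = alpha / len(raw_p)  # Bonferroni: 0.05 / 4 = 0.0125
    for name, p in raw_p.items():
        verdict = "significant" if p < threshold else "not significant"
        print(f"{name}: p = {p:.3f} -> {verdict}")

This is how a result that looks 'significant' against the usual 0.05 threshold can stop being significant once the correction is applied.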

Besides, I think the reanalyses merely show that things weren't as rosy as they were made to appear, and they do that quite effectively as they are. The real fatal flaw, in my view, is in the failure of improvements to extend to objectively measured outcomes.
Using this definition, 11% of Control participants improved, compared to 22% and 21% of CBT and GET participants respectively.
That is right on the edge of what they define as a clinically important difference in the full trial protocol:

"We propose that a clinically important difference would be between 2 and 3 times the improvement rate of SSMC."
Yes, but it has to be statistically significant too. And technically, neither of those percentages falls "between 2 and 3 times the improvement rate of SSMC". One is at exactly 2 times the SMC improvement rate; the other is slightly below.
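Spelling out the arithmetic: 22% / 11% = 2.0, which sits exactly on the boundary rather than between 2 and 3, and 21% / 11% ≈ 1.9, which falls just below it.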

I have to go now, but will take a look at your other points later today (thanks, @Esther12, for taking the time to make them!).
 
The real fatal flaw, in my view, is in the failure of improvements to extend to objectively measured outcomes.
Which, to me, cascades into what seems another underlying problem: the investigators' apparent dismissal of objective outcomes as having any relevance or importance. It's as if their whole mindset is along the lines of:

We know ME/CFS is a psychological problem; psychological problems are only ever measurable subjectively, which has always been good enough to date, and always will be; objective outcomes don't feature in our world, and we can't understand what all the fuss is about.
That's my jaded view on it, anyway.
 
I think it would be hard for the researchers to come back and say 'no, we were never planning to use the Bonferroni method', because this is the method they used in all their published papers.

Besides, I think the reanalyses merely show that things weren't as rosy as they were made to appear, and they do that quite effectively as they are. The real fatal flaw, in my view, is in the failure of improvements to extend to objectively measured outcomes.

Yes, I agree on the problem with subjective self-report versus objective outcomes. I'm just always on the lookout for nice, simple ways to summarise the problems with their spin that also don't risk getting me accused of being misleading.

And technically, neither of those percentages falls "between 2 and 3 times the improvement rate of SSMC". One is at exactly 2 times the SMC improvement rate; the other is slightly below.

Ouch - that's so accurate and yet it feels so devious. It made me hurt.
 
permit me a bizarre, over the top brainstorm. could it be that in some cases it is "the problem is that people are saying they are sick. we will train them not to do that. what's the fuss?"?

thus at root not a denial of disease [ontology] or credibility [epistemic worth]. those are irrelevant when the target population has little moral worth and are burdens to society. we have seen burden [and threat] language rise in both academic papers and newspapers.

afaik objective measures were irrelevant to the burden-to-society appeals that were used in action t4. must be careful with historical parallels. i am only suggesting that the ontological and epistemic claims [and objectivity itself] might be less relevant to mindset than we assume. @Barry
 
"However, in May 2010, several months after data collection was complete, this primary outcome measure was replaced with two continuous measures: fatigue and physical function ratings on the two scales described above (see [13,14] for details)." I didn't know/had forgotten that there was a date available for that. Where was that from?
This was the date of approval of the change by the Trial Steering Committee. Note how close this date was to the submission of the primary 2011 paper. And also how very far this date was from trial commencement.
I'd have liked a mention of the fact that after the 2011 Lancet paper was released, they claimed that the results for the recovery criteria laid out in their protocol were due for publication in an academic journal. http://www.meassociation.org.uk/2011/05/6171/ http://www.meassociation.org.uk/wp-content/uploads/2011/06/FOI+from+Queen+Mary.pdf I know that there are too many problems to detail in just one paper though.
Gosh, this is good. I looked high and low for information about when and how that change was made, but never knew about this.
There's also this quote from Sharpe from 2011:
http://www.abc.net.au/radionational...son-of-treatments-for-chronic-fatigue/2993296 "I would just like to respond to the comment about data from measures listed in the protocol not being reported in the Lancet paper. This is simply because there is too much data to adequately report it in a single paper (the Lancet like most other journals has a strict word limit of 4000 words). There is a publication plan for this, so far unpublished, data which includes papers on: 'recovery' (careful reading will make it clear that recovery is not reported in the Lancet paper, longer term outcome, mediators and moderators of response, and economic aspects including employment. I hope this is helpful"
That's really interesting to know.

Some of this is understandable. There really was too much data for one paper. But it doesn't explain why they chose to report walking test results in the initial Lancet paper, and not fitness test results. The only reason for singling out the walking test here must have been that some of those results just passed the threshold for statistical significance. You will also note, reading our paper, that they never actually bothered to present statistical analyses for the objective measures that showed nothing (fitness, employment, benefits). Sometimes they noted there wasn't much difference, but sometimes they just said nothing.
I don't really understand this sentence in the paper:
"Again, the timing of the change to the recovery definition – over a year after the trial was completed - is highly problematic."
Are you saying that the recovery definition was changed before the Lancet paper was released? Do you have access to some info on this that has not been made public, or am I misunderstanding you?
I think we were trying to be conservative. The 2013 recovery paper must have been written in early 2012 at the very latest (the first version of the paper was received by the journal in August 2012). And that must have been at least a year after the 2011 Lancet paper was submitted (there's no submission date on the Lancet paper, so we can only guess, but it was certainly before Feb 18, 2011, which was the date of actual publication).

Thanks for picking up the typos!
"Patients do just as well with some good basic medical care."
Perhaps "good" is too generous. But it did seem to us to have been better than what PwMEs commonly get in the UK in general. At least patients were given medication for pain and sleep.
This reminded me that we're still waiting for some LTFU data, eg employment.
If I were a gambling woman, I'd put a lot of money on null outcomes here. If they had been positive, we would certainly have heard about them by now!
Was the decision to keep that 'merely' in there at least partly for the fun of pushing this peer reviewer to follow through on their promise to slag you off for it?

https://jcoynester.wordpress.com/20...bused-by-a-peer-reviewer-and-silenced-by-bmj/
Haha, I see what you mean! Yes, this paper was rewritten from that earlier one that was reviewed so inadequately in the BMJ. Although just about every paragraph ended up being different, a couple of phrases seem to have slipped through!

Do you think "merely" is problematic? It's possibly something we could change at the proofs stage if it were.
 
permit me a bizarre, over the top brainstorm. could it be that in some cases it is "the problem is that people are saying they are sick. we will train them not to do that. what's the fuss?"?
@Barry
I think that's spot-on, @Samuel.

To them, the whole illness is a problem with the way you think. So what needs to be done is to change the way you think.

Again, it's a sort of begging-the-question situation: it's all fine if you believe the problem is psychological. The conclusion rests on assumptions about illness cause.
 
I think that's spot-on, @Samuel.

To them, the whole illness is a problem with the way you think. So what needs to be done is to change the way you think.

Again, it's a sort of begging-the-question situation: it's all fine if you believe the problem is psychological. The conclusion rests on assumptions about illness cause.
But - even if that is the belief - what the data keeps showing is that even if you change what people think, it doesn’t actually change their behaviour. They aren’t more active, they don’t get back to work, they don’t use fewer resources in terms of benefits etc... so that must be seen as a failure.

I don’t think non-complaining sick people are usually the goal, if the research is sponsored by the DWP! Fine for the health service, but not otherwise.
 
This was the date of approval of the change by the Trial Steering Committee. Note how close this date was to the submission of the primary 2011 paper. And also how very far this date was from trial commencement.

Thanks. I didn't know about that date.

Some of this is understandable. There really was too much data for one paper. But it doesn't explain why they chose to report walking test results in the initial Lancet paper, and not fitness test results. The only reason for singling out the walking test here must have been that some of those results just passed the threshold for statistical significance. You will also note, reading our paper, that they never actually bothered to present statistical analyses for the objective measures that showed nothing (fitness, employment, benefits). Sometimes they noted there wasn't much difference, but sometimes they just said nothing.

Yeah - and as you say in your paper, they could have released results for the outcomes in their protocol, and then also presented additional analyses. Sharpe's comment implies all those results were to be released...

I think we were trying to be conservative. The 2013 recovery paper must have been written in early 2012 at the very latest (the first version of the paper was received by the journal in August 2012). And that must have been at least a year after the 2011 Lancet paper was submitted (there's no submission date on the Lancet paper, so we can only guess, but it was certainly before Feb 18, 2011, which was the date of actual publication).

I see now. @Barry explained that I'd misunderstood that, so it seems it was a problem at my end.

Do you think "merely" is problematic? It's possibly something we could change at the proofs stage if it were.

I'm sure you'd know better than I. It just stood out and amused me because that peer reviewer seemed to think that its usage marked you out as the work of the devil. When I saw you commenting under Coyne's blog I wondered then if you would do everything possible to ensure that phrase made it through to the final paper!

Great work. Thanks for the clarifications. I think that for now I might leave it to others to start boldly asserting that the pre-specified analysis for PACE's primary outcome shows no difference between groups, and see how they get on before I dive in too.
 
But it doesn't explain why they chose to report walking test results in the initial Lancet paper, and not fitness test results. The only reason for singling out the walking test here must have been that some of those results just passed the threshold for statistical significance.
Also worth keeping in mind that one of the PACE participants recently observed (cannot recall where, maybe in a response to one of DT's blogs?) that even the objective measures were not really that objective, because in order to meet the trial's physical demands, they backed off from some of their other physical activities, robbing Paul to pay Peter. Trial design issues again.
 
Are you saying that the recovery definition was changed before the Lancet paper was released? Do you have access to some info on this that has not been made public, or am I misunderstanding you?

I had assumed that the recovery definition was basically dropped from being a secondary outcome when the stats analysis plan was approved as it failed to mention recovery. I don't think that plan ever acknowledged the changes or gave reasons but just did them. I wondered if they never got explicit approval for the actual changes.
 
Thanks, @Simon M. Yes, that's a good point, thanks for mentioning. Although they may be using the term more in the sense of "we're calling it 'exploratory' to get around the fact that we didn't use the definition we set out in the protocol".
I’m probably going on about this unnecessarily, but...
the recovery definition used is based around the “normal” range for the primary outcomes of fatigue and function. This normal range was explicitly labelled as post hoc in the 2011 Lancet paper.

Now, they didn’t need trial data to create the erroneous “normal” range, but I think somebody mentioned that the authors claimed it was a reviewer who insisted on using this range in the Lancet paper. If that’s the case, it was surely created after data analysis - and therefore the recovery paper itself must have used a recovery definition created after sight of the data.

The May 2010 date that you mentioned for trial steering group approval of changes also coincides, I think, with data unblinding. So presumably they will say they got the changes approved, then did the analysis.
 
Hello all,

I'm pleased to report that our major critique and reanalysis of the PACE trial has been accepted for publication in BMC Psychology.

Title: Rethinking the treatment of chronic fatigue syndrome—A reanalysis and evaluation of findings from a recent major trial of graded exercise and CBT

Authors: Me, Tom Kindlon, Alem Matthees, Robert Courtney, Keith Geraghty, David Tuller, and Bruce Levin.

Here is the abstract:


The fully formatted version will be available soon at the journal website and will be open access (I'll post a link as soon as one's available). But for those who can't wait that long, here is my own version, which the journal rules allow me to circulate.

Or you can download it here:
https://www.researchgate.net/public...recent_major_trial_of_graded_exercise_and_CBT

Thanks to all those not mentioned in the author list who contributed by reading our drafts, answering our questions, and discussing the issues with us.



Thank you, Carolyn and others - I haven't read yet - but you are stars, all of you, for continuing to challenge the atrocity of PACE. Am so grateful.
 
I think this is the gist of many PACE-type successes: we can do more than we would by choice/experience, but we pay for it later. Just report the doing more, ignore the consequences, ignore the resulting permanent deterioration, and claim it's a cure :emoji_face_palm:
Then use that "cure" to harm more patients. :emoji_face_palm:
But in the case I was talking about, somebody had said the participants had, in some cases, had to stop doing some of their normal activities so they could maintain their activities for Peter White's PACE trial. So even if it looked like some objective measures were similar at 52 weeks to baseline, the person might in fact have been significantly worse overall. PACE was only selectively sampling physical activity change, with no real handle on the participants' overall activity change.
 
I’m probably going on about this unnecessarily, but...
the recovery definition used is based around the “normal” range for the primary outcomes of fatigue and function. This normal range was explicitly labelled as post hoc in the 2011 Lancet paper. Now, they didn’t need trial data to create the erroneous “normal” range, but I think somebody mentioned that the authors claimed it was a reviewer who insisted on using this range in the Lancet paper. If that’s the case, it was surely created after data analysis and therefore the recovery paper itself must have used a recovery definition created after sight of the data.

The May 2010 date that you mentioned for trial steering group approval of changes also coincides, I think, with data unblinding. So presumably they will say they got the changes approved, then did the analysis.

Recovery was a secondary outcome in the protocol. As I see it, the stats plan in May 2010 replaced the protocol in terms of the analysis that was done and, if I remember correctly, contained no mention of recovery. This then allowed them to use an ad hoc recovery definition when they wrote the recovery paper. So I think they lined up two different decisions.
 
But in the case I was talking about, somebody had said the participants had, in some cases, had to stop doing some of their normal activities so they could maintain their activities for Peter White's PACE trial. So even if it looked like some objective measures were similar at 52 weeks to baseline, the person might in fact have been significantly worse overall. PACE was only selectively sampling physical activity change, with no real handle on the participants' overall activity change.
That makes sense. So that makes two mechanisms: overdoing then paying for it, and reallocating.
I admit that I have done both :ill:
 