Video: The PACE trial: a short explanation, Graham McPhee

Thanks for explaining @Graham. It's never made any sense why their "recovery" criteria are in many ways less stringent than the criteria for "improvement" - hence my confusion - I had assumed (clearly wrongly) that they had at least included those who had improved! I have looked at the data though, and no-one meets that criteria of entering with a score of 65 and "recovering" with a score of 60. There were 3 pts that registered these scores (65 at entry, 60 at exit), but all scored over 18 in "fatigue".

I guess I'm trying to understand why anyone would still want to defend these studies, and why they fail to see the flaws.
 
The recovery paper was a big own goal for them because it is indefensible, and some trivial level of thought would have told them that. Their statements around it suggested they knew this as well. I can see that with the main paper they could claim they just don't understand or agree with the methodological concerns.
 
I have looked at the data though, and no-one meets that criteria of entering with a score of 65 and "recovering" with a score of 60.
Yes, that's true. But the second video was about the faults in their use of very basic statistics rather than the actual results. My overall idea was that the first video showed that the results were pretty irrelevant – so in that sense, whatever they did with them was meaningless: the scores were just too easily manipulated. The second video is there to challenge their status as "experts": when they can use an unmatched and unhealthy sample to calculate a recovery target, and use the mean and standard deviation as measures for norms on a heavily skewed set of data, they cannot expect any respect for their "skills".
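To make the point about mean and standard deviation on skewed data concrete, here is a minimal sketch. The scores are invented for illustration only (they are not the Bowling data or any trial data); they simply pile up near the top of a 0 to 100 scale with a long tail of low values. On data shaped like this, "mean plus or minus one SD" stops behaving like the familiar "middle two-thirds".

```python
import statistics

# Hypothetical scores on a 0-100 scale in steps of 5 (NOT real survey data).
# Half the sample sits at the ceiling, with a long tail of low scores.
scores = (
    [100] * 50 + [95] * 15 + [90] * 10 + [80] * 5
    + [60] * 5 + [40] * 5 + [20] * 5 + [0] * 5
)

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)   # population standard deviation
median = statistics.median(scores)

lower = mean - sd                # the kind of "normal range" lower bound under discussion
upper = mean + sd                # lands above the maximum possible score here

# Share of the sample inside mean +/- 1 SD; about 68% only if data are normal.
share_within = sum(lower <= s <= upper for s in scores) / len(scores)

print(f"mean={mean:.2f} sd={sd:.2f} median={median}")
print(f"mean - sd = {lower:.2f}, mean + sd = {upper:.2f}")
print(f"share within one SD of the mean: {share_within:.0%}")
```

With these made-up numbers the upper bound falls beyond 100 and far more than two-thirds of the sample lies inside the range, so the lower bound says little about where a typical score sits. Percentiles are the usual alternative for skewed data.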

The point about the overlap is not that any patients fell into that category but that after blundering with the comparison sample, and blundering with the calculation, the result was so idiotic, it should have screamed "error" at them.

I'm a bit concerned that my intention in the second video is not coming across: that the second video may be perceived to be more about recovery claims than about their poor understanding of statistical techniques. Perhaps I need to spell that out in video 3, which draws a few things together.

I am being kind here with my remarks about the PACE authors. It strikes me that there are two options, and that different ones may apply to different authors. The first is that they were unskilled and really didn't know that they had blundered. The second is that they are only concerned about their own status and preserving their skins, and so were happy to manipulate the situation, knowing the calculations were wrong.
 
I think I might have an inkling how the reduction to 60 occurred conceptually - if you assume that the whole "normal (not normal) range" thing was more of a post-hoc explanation on their part.

The mean PF score in each group at baseline was about 40. What they *thought* they were doing (aiming for) was getting the mean *group* scores above 60 to demonstrate recovery/improvement (whatever). I suspect something got lost in translation, and it got applied as an individual threshold instead, without anyone properly understanding the consequences of that.
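A rough sketch of why a group-level target and an individual threshold are different things. The scores below are invented for illustration (not trial data), and the only figure taken from the thread is the 60 cut-off.

```python
from statistics import mean

THRESHOLD = 60  # the cut-off discussed in the thread

# Hypothetical end-of-trial scores for a small group (NOT trial data).
group = [30, 45, 55, 60, 70, 85, 95]

group_mean = mean(group)

# Group-level question: did the average reach the target?
group_meets_target = group_mean >= THRESHOLD

# Individual-level question: which people reached the target themselves?
individuals_meeting = [s for s in group if s >= THRESHOLD]

print(f"group mean = {group_mean:.1f}, meets target: {group_meets_target}")
print(f"{len(individuals_meeting)} of {len(group)} individuals at or above {THRESHOLD}")
```

In this toy group the average clears 60 even though several people are below it, so a number chosen as a sensible target for a group mean does not automatically make sense as a pass mark for each person.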

But whatever they did, it shows a striking lack of understanding of basic stats, as you say.

I'm just trying to understand it from their point of view - because it might make it easier to explain to others why the mistake was made.
 
My cynical head says they said to themselves 'what percentage improvement and recovery do we want this trial to show?'. Then they took a sneaky peek at the results coming in and realised they weren't going to come anywhere near the 60% improvement, 20% recovery they wanted, so they worked out what level would give the desired results, and made up a post hoc justification to 'explain' the changed boundaries.

Wessely's analogy of adjusting the route of a ship part way through a journey to ensure it reaches the correct destination says it all really.

Cynical? Moi? :(
 
We can only guess at their motivation and ignorance, but to me it goes along the lines of them realizing that their results weren't good enough, but, like a fervent believer trusting in the truth of their belief, they felt that something must be wrong with the targets. So when one of them came up with a comment that they had been playing about with the Bowling data, and discovered that the "normal range" was 60 to 100, they all jumped on board with this without ever bothering to check it out.

The fact that they didn't check it out (and again, I'm being generous here – it could be that the process was deliberate) could suggest that the person coming up with this idea was pretty senior. Wasn't Wessely on the statistics oversight committee or something like that? Could that be relevant? The truth is that we will never know: they close ranks tighter than the Roman army forming a tortoise.
 
I am being kind here with my remarks about the PACE authors. It strikes me that there are two options, and that different ones may apply to different authors. The first is that they were unskilled and really didn't know that they had blundered. The second is that they are only concerned about their own status and preserving their skins, and so were happy to manipulate the situation, knowing the calculations were wrong.
A letter published in the Lancet in 2011 pointed out that the data weren't for people of working age. They responded to it and accepted the point, yet repeated the claim in the 2013 paper. Also, Peter White co-authored a paper on full recovery in CFS which said that the SF-36 is not normally distributed.
 
Sorry to be nit-picky too, but there is an issue with the whole recovery cut-off thing that keeps being ignored. It seems to have become a bit of a meme, this idea that you could enter the trial with a score of 65 and yet be recovered with a score of 60. It's just not true. The recovery criteria also required an improvement of 20 points to qualify as recovered (in addition to an 8 pt decrease on CFQ and no longer meeting Oxford criteria). I know it's a small thing, but the PACE authors will always have an advantage if that "flaw" keeps being used.
I don’t recall an 8-point improvement on the CFQ being used for anything. 8 points was the threshold for improvement on SF36 PF. My only recollection of a 20 point difference was as one of the harms measures, never as a measure of improvement.
 
Thanks @Andy: I've just been loading and checking the subtitles and came across it. There's a word I said to myself. I'm trying to find out whether I can edit it in YouTube, whether I have to upload it all again, or whether simply to issue an apology underneath! I didn't think many people would be that quick off the mark.


Thanks @Trish. It's a tricky one that I puzzled over. The scale only permits you to score multiples of 5, so strictly speaking the borders are correct and the green and blue rectangles cover the correct area. I decided in the end that people would listen to it and not bother too much about things like that - it is for those who find the usual analyses too heavy going.

I think, though, @Graham it needs to be consistent. You point at the mark after 65 for 65, but before 60 and before 85 for those numbers. They should all either be pointing before or after.

Great job again, though, thanks.
 
Yes, that's true. But the second video was about the faults in their use of very basic statistics rather than the actual results. My overall idea was that the first video showed that the results were pretty irrelevant – so in that sense, whatever they did with them was meaningless: the scores were just too easily manipulated. The second video is there to challenge their status as "experts": when they can use an unmatched and unhealthy sample to calculate a recovery target, and use the mean and standard deviation as measures for norms on a heavily skewed set of data, they cannot expect any respect for their "skills".

The point about the overlap is not that any patients fell into that category but that after blundering with the comparison sample, and blundering with the calculation, the result was so idiotic, it should have screamed "error" at them.

I'm a bit concerned that my intention in the second video is not coming across: that the second video may be perceived to be more about recovery claims than about their poor understanding of statistical techniques. Perhaps I need to spell that out in video 3, which draws a few things together.

I am being kind here with my remarks about the PACE authors. It strikes me that there are two options, and that different ones may apply to different authors. The first is that they were unskilled and really didn't know that they had blundered. The second is that they are only concerned about their own status and preserving their skins, and so were happy to manipulate the situation, knowing the calculations were wrong.

That raises the question about how well briefed they were on the TSC and DMEC when they agreed to the changes to the protocol, and the minutes suggest they were not at all briefed.
 
Is it sufficient to look at "normal range"?
Agreed. The trouble is that in the PACE trial there are just too many issues: too many faults: too many poor decisions. I'm trying just to focus on two simple but major errors that in themselves would be sufficient to destroy any proper scientific study.

The authors are well-practised at sidestepping such criticisms and leading away into other issues. I'm trying to restrict the conversation just to two areas – the reliance on easily manipulated subjective assessments that are contradicted by physical assessments, and the failure to carry out the most basic of statistical techniques properly. Unreliable data casts doubt on all of their conclusions, as well as those of the many similar, earlier trials. Failure with the statistics casts doubt on any claims that they may make about scientific standards of expertise.

Or of course, we could go down the "of course they knew, but..." route!
I think, though, @Graham it needs to be consistent. You point at the mark after 65 for 65, but before 60 and before 85 for those numbers. They should all either be pointing before or after.
Actually, I am being consistent: but of course it may not be clear enough. The "arrow" acts as a guillotine, separating each set of scores, and as such cuts the range in the right place. The blue and green rectangles cover the required ranges.

The problem is that most people read the horizontal axis as a continuous scale, rather than as discrete values, as though you could plot 63.5.

The problem is that, from what I can see, I can only scrap it and start the upload process again, in which case the link will be different. I'm not sure that the confusion here and the glitch in the sound track are big enough to warrant that. I'm open to persuasion.
 
We can only guess at their motivation and ignorance, but to me it goes along the lines of them realizing that their results weren't good enough, but, like a fervent believer trusting in the truth of their belief, they felt that something must be wrong with the targets. So when one of them came up with a comment that they had been playing about with the Bowling data, and discovered that the "normal range" was 60 to 100, they all jumped on board with this without ever bothering to check it out.

The fact that they didn't check it out (and again, I'm being generous here – it could be that the process was deliberate) could suggest that the person coming up with this idea was pretty senior. Wasn't Wessely on the statistics oversight committee or something like that? Could that be relevant? The truth is that we will never know: they close ranks tighter than the Roman army forming a tortoise.

No, Wessely wasn't on any of the trial committees.

I think this idea of getting the evidence to fit their expectations is absolutely right. They started the whole process as we saw from their application to the MRC with certain results in mind. Hey presto, what did their (amended) analysis show: the results they anticipated all along.
 
It's up to you. I can only say that I didn't see it how you describe.

It is great work, Graham. I am being hypercritical after you have done so much.
 
Summary of this post: Maths teacher being nerdy. Best ignored.

@Graham, your scale and numbers showing the cut offs at 65, and the 85 moving down to 60 is ambiguous - I think the latter seems to point at 55. Probably doesn't matter, the point is made clearly anyway.
Excellent video again, thank you. And I like your dig at the end.

I have watched the section I disputed again and now think you were right and I was wrong. Apologies, @Graham.

The coloured areas you show as inclusive in each case are correct, and the pointers are correct. The problem comes with the tricks the eye plays on a marked scale with numbers between the marks. So on your | 60 | 65 | 70 | etc. I read the numbers as applying to the vertical markers, when in fact the vertical markers are midpoints between the numbers.

So for example when you say 65 or below, you have to point to the marker to the right of the 65, so 65 is included, whereas when you say 60 or above, you point to the marker to the left of the 60. That's how you end up with an apparent gap of 10 points between 60 and 65.
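For anyone who prefers to see it spelled out, here is a small sketch of the discrete scale. It assumes only what is said in this thread: scores come in multiples of 5, "65 or below" on the way in, and "60 or above" for the lowered threshold. It looks at that single threshold on its own and ignores any other criteria that applied alongside it. The variable names are mine.

```python
# The scale only allows multiples of 5, so list the possible scores explicitly.
SCALE = list(range(0, 101, 5))

ENTRY_MAX = 65      # "65 or below" to enter
RECOVERY_MIN = 60   # "60 or above" after the threshold moved down from 85

entry_scores = [s for s in SCALE if s <= ENTRY_MAX]
recovery_scores = [s for s in SCALE if s >= RECOVERY_MIN]

# Scores that satisfy both conditions at once.
overlap = [s for s in SCALE if RECOVERY_MIN <= s <= ENTRY_MAX]

# On a diagram where the markers sit midway between the numbers, the two
# cut lines fall half a step (2.5) outside the scores they include.
left_marker = RECOVERY_MIN - 2.5    # just left of 60
right_marker = ENTRY_MAX + 2.5      # just right of 65
apparent_gap = right_marker - left_marker

print("overlapping scores:", overlap)
print("distance between the two markers:", apparent_gap)
```

Only two possible scores sit in the overlap, but the two markers end up 10 points apart on the picture, which is where the eye gets confused.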

Sorry to have been so dense before - my excuse is that as a Maths teacher showing students how to mark a scale, you would never place the numbers between the marked points of the scale, so my eye played tricks.

If I were to try to make it less ambiguous I might remove the interval markers on the scale and just have the series of numbers. Since it's not a linear scale anyway, that might be a good solution.

However, let me hasten to add, I'm not suggesting you do it again. It's fine as it is. I made a fuss about nothing. :)
 
I am nothing but grateful for all your inputs, especially when you disagree with me! My reason is simple: I like you, respect you, and know we are on the same side. If something sits uneasily with you, or if you think I have got something wrong, then I need to get it right for people who aren't on our side.

It's only the hassle of changing the link to the video that puts me off re-editing the problems you all mention. It's trying to decide whether the additional clarity is worth a changed link. If I could find a way of uploading the video to the same link, I would improve the video.

If any of you are a basic YouTube user and know how to achieve that, I'd be pleased to hear.
 
Don't even try to change it @Graham, it's absolutely fine, and, like the first one, makes a very important point clearly and succinctly. I love all the little quips and digs, like the Maths is so simple even the PACE people should understand it.
Looking forward to the final episode. :)
 
As people were nitpicking, I thought I'd quickly listen through and point out any possible problems. Just one, and a couple of things that could be close to the edge if you were being extra cautious:

re claim of 'people being aged between 18 and early sixties', there was actually one old participant included too. Data on ages here: https://www.bmj.com/content/347/bmj.f5963/rr/675527

re 'they decided that it meant that two-thirds of scores were between 60 and a hundred' - the PACE people never explicitly said something like that, although their use of 'normal range' did imply it.

re 'so they set it at 60 as, according to them, that's where many healthy folk would be' - I think that them using 60 as a cut off for their recovery criteria does kind-of imply this, but they didn't exactly say it.

I'm going to have to re-watch now. I feel like paying attention for possible problems meant that I didn't really take in the argument. Thanks Graham.
 