A general thread on the PACE trial!

Discussion in 'Psychosomatic research - ME/CFS and Long Covid' started by Esther12, Nov 7, 2017.

  1. Esther12

    Esther12 Senior Member (Voting Rights)

    Messages:
    4,393
    In this case it meant that QMUL was able to completely evade the real concerns about access to PACE data, and present Monaghan's concerns as unfounded. I think that getting the details right at this stage is still going to be very important.

    That was recordings of therapy sessions.
     
    Invisible Woman, EzzieD and Barry like this.
  2. Dolphin

    Dolphin Senior Member (Voting Rights)

    Messages:
    5,949
  3. Snow Leopard

    Snow Leopard Senior Member (Voting Rights)

    Messages:
    3,933
    Location:
    Australia
    QMUL are not claiming the data is lost; they're claiming a much more sophisticated version of the dog-ate-my-data defence.

    They're claiming that the statisticians have moved on and it is impossible for them to get another statistician to access the data. Yes, really.
     
  4. large donner

    large donner Guest

    Messages:
    1,214
    Which really means it will be impossible for us to salvage our reputation if we give this to another statistician who doesn't cook the books like Peter White.
     
    Graham, Invisible Woman and Sean like this.
  5. alex3619

    alex3619 Senior Member (Voting Rights)

    Messages:
    2,231
    I may have found a new hole in the PACE trial. I have been looking into the theory of statistics, even though I cannot do math myself any more.

    I don't know if this has already been looked at.


    Preamble

    We already know that there was no significant difference at long term follow-up.

    We already know they used the wrong data set for the SF-36 PF scale when calculating the SD and the normal range.

    We already know that SD is undefined for this data distribution, and that PDW knew the calculation was biased. This was a deliberate use of a biased method.

    The p value is a calculation of the risk that the result is due to chance. It does not take into account the risk of confounds or the risk due to biased methodology.


    Argument

    The calculation of the p value changes if there are multiple outcome measures and you can choose between them. In my personal opinion it's a simplistic form of p-hacking if you do not make adjustments.

    My source for this is S. Nassir Ghaemi's "A Clinician's Guide to Statistics and Epidemiology in Mental Health".

    This makes sense when you consider p value as a probability calculation.

    The Bonferroni correction is an imprecise first-pass correction: you simply divide the 0.05 significance threshold by the number of outcome measures. If, for example, there are ten outcome measures, the threshold for significance becomes not 0.05 but 0.05/10 = 0.005.
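
    A minimal sketch of that calculation, using made-up p values for ten outcome measures (nothing here comes from PACE):

    Code:
    # Bonferroni correction: divide the significance threshold by the number of outcome measures.
    p_values = [0.010, 0.030, 0.004, 0.200, 0.049, 0.070, 0.500, 0.015, 0.060, 0.002]  # made-up values
    alpha = 0.05
    threshold = alpha / len(p_values)   # 0.05 / 10 = 0.005

    for i, p in enumerate(p_values, start=1):
        verdict = "significant" if p < threshold else "not significant"
        print(f"outcome {i}: p = {p:.3f} -> {verdict} (Bonferroni threshold {threshold:.3f})")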

    There are other corrections such as the Tukey test but I know nothing about them.

    As the p value for significance changes, so too does the estimation that the result is due to chance. Again, this does not correct for confounds nor bias.

    This is complicated by how to treat the dropping of activity data for patients.


    Action

    @Carolyn Wilshire @Graham @Dolphin @dave30th @Jonathan Edwards

    I am hoping someone who can still do math can look into this if it has not already been examined. Alternatively it would be good to give this to some statisticians and ask what they think. I am content to be wrong, but this needs to be examined if it has not been examined already.

    Their likely counter-argument is that the alternatives are secondary outcome measures.

    PS The Bonferroni correction is imprecise because I think it presumes that all measures have equal relevance. This is part of the primary versus secondary outcome measure argument.
     
    Last edited: Jun 21, 2018
    janice, Sean and JohnM like this.
  6. JohnTheJack

    JohnTheJack Moderator Staff Member

    Messages:
    4,854
    No, really, Barry, it's important we get this right. The raw data are safely stored and locked up in a cabinet. Nothing has been lost or mislaid or anything like that.

    The question is around a legal definition only.
     
    janice, Sean, Barry and 5 others like this.
  7. dave30th

    dave30th Senior Member (Voting Rights)

    Messages:
    2,479
    I don't think this is exactly right. They're claiming that even if they could get a statistician, which they obviously could, they shouldn't be required to go beyond what other departments would have to in order to comply, and most government units would presumably not have extra statisticians around. So to require the university to find someone else would be putting an extra burden on them under the law. That's the argument as I understood it, but I could be wrong.
     
    janice, Barry, JohnM and 7 others like this.
  8. JohnTheJack

    JohnTheJack Moderator Staff Member

    Messages:
    4,854
    I think I have now succeeded in shifting this argument. The ICO has said in her latest submission to the Tribunal that QMUL must now show either that it is impossible to provide patient-level data without the input from someone who took part in PACE, or that a QMUL statistician could provide the information but it would be against the terms of their research funding to do so.

    In response the Tribunal has given us until the 4th July to provide further submissions, and declared that there is a need for a hearing to consider the case. The hearing will take place in Swansea some time between the end of August and the beginning of December.

    If you want more on this, contact me privately.
     
    MeSci, Sly Saint, janice and 19 others like this.
  9. Graham

    Graham Senior Member (Voting Rights)

    Messages:
    3,324
    Sorry Alex, you over-estimate my statistical skills. At A-level there are, as you probably know, three areas (just as in Science there is Physics, Chemistry and Biology). My preferences are Mechanics and Pure, with Statistics/Decision maths limping along behind.

    But I do have a concern about spending time on further detailed analysis of their errors. Their decision to ignore objective assessments and focus only on the subjective assessments that they could manipulate is, to me, enough to blow their work out of the water (back to the SS PACE analogy!). Showing that they fiddled the targets and used utterly inappropriate methods to justify it demonstrates that they are either utterly incompetent or totally devious. I don't think we need to grace their sophisticated analyses with any more effort: it could distract from the key message.
     
    JohnTheJack, Inara, Sean and 7 others like this.
  10. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,498
    Location:
    Mid-Wales
    I have an MSc in Medical Statistics, if that helps. There are worse problems with the PACE trial than their significance testing. The main issue is that although the result was statistically significant, it wasn't clinically significant. It's not about p values per se - differences in the effect estimates are more important.

    But by far the biggest error is the use of self-reported subjective outcomes that can be easily biased by the intervention (re-education of how to complete a questionnaire, as @Graham has recently demonstrated). Significance is then the least of their problems.
     
    JohnTheJack, EzzieD, Cheshire and 5 others like this.
  11. alex3619

    alex3619 Senior Member (Voting Rights)

    Messages:
    2,231
    Agreed. I only brought it up because I do not recall it being discussed, and it's yet one more problem. The point here is that if this applies then the result was not even statistically significant, even before you get to clinical significance. There are a huge number of problems in this trial.

    Recently I have been very concerned about participants reporting that their comments on being made worse were not even recorded. I heard the first one of those during the trial period. I am also keen on pursuing the deliberate misuse of SD. Most other things can be considered due to methodological problems that are common in psychiatry, or ineptitude, or some other argument. The misuse of SD shows deliberate use of biased methods. That deliberate use raises the question of scientific misconduct that I think needs to be pursued.

    Significance is not as important as the methodological failings, as with confounds and bias the p value is distorted anyway. Yet the list keeps growing longer, and we have not looked hard at this from a political or social angle; we have focused on the science, where we have stronger evidence. In dealing with politicians and bureaucrats, though, I think it's worth looking harder at the political issues than we have so far.
     
    EzzieD, janice, Amw66 and 2 others like this.
  12. Sean

    Sean Moderator Staff Member

    Messages:
    8,232
    Location:
    Australia
    Yes, I think there is more mileage in this yet, because it exposes both deliberate misrepresentation by the authors, and poor quality peer-reviewing and editorial oversight. Plus it is also clear cut.
     
    EzzieD, Trish, janice and 3 others like this.
  13. Carolyn Wilshire

    Carolyn Wilshire Senior Member (Voting Rights)

    Messages:
    103
    Yes, I think this is a good point. It's the reason why clinical trials are supposed to have a single primary outcome measure: so that you can't take bites out of multiple cherries and then pick the juiciest one. Technically, your primary outcome should be the "gate keeper", so you only proceed to test secondary outcomes if that's significant. If researchers approach it in this way, then there is no need to correct; by requiring a p value of less than .05 as a pre-requisite, you've effectively controlled Type I error not only for the primary outcome, but you've also reduced it massively for the secondary outcomes.
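
    A toy sketch of that gate-keeping logic, with hypothetical p values (nothing from the trial itself):

    Code:
    # Gate-keeping: secondary outcomes are only tested if the primary outcome is significant.
    alpha = 0.05
    primary_p = 0.03                    # hypothetical primary outcome p value
    secondary_ps = [0.04, 0.20, 0.01]   # hypothetical secondary outcome p values

    if primary_p < alpha:
        for i, p in enumerate(secondary_ps, start=1):
            print(f"secondary {i}: p = {p} -> {'significant' if p < alpha else 'not significant'}")
    else:
        print("Primary outcome not significant; secondary outcomes are not tested.")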

    Although in reality, researchers will often report any secondary measures that were significant, even when the primary one wasn't, and as you say, this should be corrected in some way to account for the increased opportunities you had to get a significant result.

    Of course, all this assumes that you stick to your pre-registered primary outcome - you don't switch outcomes like the PACE researchers did.
    There is now a wide consensus that the Bonferroni correction is overly conservative. This became clear when we started doing fMRI, where you might need to do hundreds of thousands of tests, one for each "voxel" (the minimal volume unit in the brain). You can see that using the Bonferroni correction there would preclude ever finding anything significant without massive power, because you'd need p to be less than something like .0000005!

    Tukey is still used in practice by some researchers, but the contemporary view in the psychological research methods community is to use the False Discovery Rate (FDR) or, if you're doing a lot of comparisons, permutation thresholding.

    Some consider FDR to be overly lax, so there is a preference now in the neurosciences for permutation thresholding.
    Although I consider myself maths-minded, I admit to not knowing a lot about the stats behind either of these. I think I once read up on FDR, but have forgotten all I ever learned.
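
    From what I remember, the rough logic of the Benjamini-Hochberg FDR procedure goes something like this (made-up p values, and happy to be corrected):

    Code:
    # Benjamini-Hochberg: find the largest rank k with p_(k) <= (k/m) * q,
    # then call everything up to and including that rank significant.
    p_values = sorted([0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216])
    q = 0.05                 # the false discovery rate we are willing to accept
    m = len(p_values)

    k_max = 0
    for k, p in enumerate(p_values, start=1):
        if p <= (k / m) * q:
            k_max = k

    print("significant after FDR correction:", p_values[:k_max])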

    Another consideration is the purpose of the research. Correcting for multiple testing is all about striking a balance between Type I errors (falsely reporting an effect when it isn't there) and Type II errors (failing to report an effect that is genuinely there). If the purpose of the study is to advance knowledge, you might be willing to tolerate a bit more Type I error so as not to miss any genuine effects that might yield new avenues of enquiry. But if your study is to provide evidence for the efficacy of an expensive and/or risky treatment, you might want to be extra sure you have a genuine effect. In this instance, using the Bonferroni correction, although stringent, might be wise, because then you can say "There's really virtually no chance this effect is spurious".
    Yes, corrections for multiple comparisons are usually only employed when there are multiple comparisons done on the same dependent variable, but as you say, this is wrong: if you measure, say, 20 variables, one of your comparisons is likely to come out significant by chance alone, even where there is genuinely no effect. My feeling is that some sort of correction needs to be done in this situation.
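
    As a quick worked example of that point, assuming 20 completely independent measures and no genuine effect on any of them:

    Code:
    # Chance of at least one spuriously "significant" result across 20 independent tests.
    alpha = 0.05
    n_tests = 20
    familywise_error = 1 - (1 - alpha) ** n_tests
    print(round(familywise_error, 2))   # about 0.64, i.e. roughly a 64% chance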

    In the situation where different outcome measures have different statuses (primary vs. secondary), if you want to report secondary outcome measures even when the primary outcome measure yielded no significant effect, then you should definitely correct in some way, but I'm not exactly sure off the top of my head just how.
     
    Last edited: Jun 22, 2018
    JohnTheJack, EzzieD, Trish and 5 others like this.
  14. Carolyn Wilshire

    Carolyn Wilshire Senior Member (Voting Rights)

    Messages:
    103
    For anyone who just glazes over when Type I and Type II errors are mentioned, a good way to remember the difference is the story of the boy who cried wolf: the first error (reacting to a cry of "wolf" when there was actually no wolf) is a Type I error, and the second error (ignoring the boy's cry when it was actually true) is a Type II error. So the first error is Type I, the second error is Type II.
     
  15. Sean

    Sean Moderator Staff Member

    Messages:
    8,232
    Location:
    Australia
    Type I errors (falsely reporting an effect when it isn't there) are also known as false positives.

    Type II errors (failing to report an effect that is genuinely there) are also known as false negatives.
     
    ScottTriGuy, EzzieD, Trish and 3 others like this.
  16. janice

    janice Senior Member (Voting Rights)

    Messages:
    135
    Location:
    U.K.
    In my brain fogged and amateurish way I thought that the "lost" data bit referred to researchers wanting the objective measures data for reanalysis.
    What happened to those results?

    Please point me to the right place if I've missed this.
     
    Inara likes this.
  17. Adrian

    Adrian Administrator Staff Member

    Messages:
    6,664
    Location:
    UK


    I think at the heart of their argument is that the database has a data dictionary and it is too hard for someone with no knowledge of PACE to access the relevant fields or perform any simple statistics. If this is the case, I don't see how they can share information with other researchers, since they don't have the knowledge to pull out the parts that the researchers are interested in; and if it requires detailed knowledge of PACE to pull out the data, I assume the receiving academics may also not be able to do so in a reliable manner. To me their argument is basically saying they didn't do enough documentation in their data dictionary to catalogue the database (but I don't believe this).

    So I think in effect they are either misleading Walport and hence the commons committee or they are misleading the ICO.

    I guess there is another issue around the availability of statisticians, but I don't believe that they have no non-contract staff who could access a database or perform simple stats. I would be very surprised if their computer science department does not teach data science and hence has no staff who could do this very easily.
     
  18. Andy

    Andy Committee Member

    Messages:
    23,217
    Location:
    Hampshire, UK
    This gives an overview of the events that led to the release of part of the data from PACE, http://me-pedia.org/wiki/PACE_trial#Release_of_Data - note that the complete trial data wasn't requested, and therefore only what was requested was released. The following section gives links to the reanalysis itself.
     
  19. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,498
    Location:
    Mid-Wales
    I don't doubt that what they did was statistically significant, but if what they did was incorrect and biased, then it's how that result is interpreted that's important. No amount of reanalysis or fancy statistical adjustment is going to make any meaningful difference to that.
     
    janice, JohnTheJack and Trish like this.
  20. janice

    janice Senior Member (Voting Rights)

    Messages:
    135
    Location:
    U.K.
    Brilliant.:thumbup:

    When I'm not distracted by THE Westminster Hall debate (which I feel I can watch having read the thread on S4ME = brilliant too) I will certainly have a look at this.

    Thank you for pointing me in the right direction.:)
     
    BurnA, Trish and Andy like this.
