Effectiveness of a symptom-clinic intervention ... multiple and persistent physical symptoms, 2024, Burton, Deary et al

Discussion in 'Other psychosomatic news and research' started by rvallee, Jan 17, 2021.

  1. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
    Diagnosis
See the post upthread for the list of questions in the PHQ-15. I note that women have to answer an additional question about periods, and I think most menstruating women would answer at least 'bothered a little' about period symptoms. So, the PHQ-15 and this diagnostic approach make it more likely that women will be included in the sample.
    Edit - later in the paper they say they adjusted scores when people left that line and/or the one about sexual activity blank.
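The paper doesn't give the adjustment formula. Prorating the answered items up to the full scale is one common approach - the sketch below is purely hypothetical, not necessarily what the authors did:

```python
# Hypothetical sketch only: the paper says scores were "adjusted" when the
# menstruation and/or sexual-activity items were left blank, but it doesn't
# give the formula. Prorating answered items is one common approach.
def prorated_phq15(item_scores):
    """Each PHQ-15 item is scored 0-2; None marks a blank item.
    Scale the total of answered items back up to the full 15-item range."""
    answered = [s for s in item_scores if s is not None]
    if not answered:
        raise ValueError("no items answered")
    return 15 * sum(answered) / len(answered)

# e.g. someone who skips the menstruation item:
scores = [1, 0, 2, 1, 1, 0, 1, 2, 0, 1, 1, 0, 1, 1, None]
print(prorated_phq15(scores))  # ~12.9, versus a raw total of 12
```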



So, they were looking for people who had a code for a physical symptoms syndrome, had been referred to specialists at least twice in the preceding 3 or so years, but did not have a medical diagnosis thought to cause the symptoms. That leaves a great gaping hole for people with persistent symptoms with a biological cause that has not yet been identified. And these researchers thought it was ethical to instruct doctors to 'negotiate' a made-up explanation basically amounting to 'your brain is wrong, it got muddled up somewhere along the way, so you only think you are unwell, but actually you are fine, so get on with things'.


GPs were able to veto someone being invited into the study. I'd be very interested to see the gender split of the vetoed versus non-vetoed.


    Depending on what the patient information sheet said, there may have been some substantial self-selection. It would be interesting to know how the intervention was described.
     
    Last edited: Jun 15, 2024
    alktipping, rvallee, Lucibee and 6 others like this.
  2. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
    Randomisation; masking
A big deal is made of the randomised allocation of people to the treatments. But randomisation counts for little when the participants knew whether they were allocated to the active treatment or to 'nothing'.

Just noting, from the quoted participant interviews: the trial did include people whose primary symptom was fatigue. I guess if ME/CFS isn't diagnosed, you don't have to follow the NICE ME/CFS Guideline.

    They audio recorded all the symptom clinic consultations. That will be a gold-mine as they refine techniques to convince people their symptoms can safely be ignored, and if ignored well enough, will go away.

A lot of measures were taken - it will be interesting to see whether results for all of them are reported.
     
  3. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
It looks as though harms were reported by participants, but I assume that if participants stopped participating and weren't contactable, harms wouldn't be reported. There is no indication that medical records were checked during the following year or that there was any feedback from participants' own GPs, and health-care use was gathered only by self-report from the participants who were still around to be participating.

I always get a bit worried when investigators start messing around with the data. Does this mean 'oh, this doctor didn't seem to have such a good impact on improvements as all the other doctors, so we will improve the scores of his patients a bit'? Or 'older people didn't seem to improve as much as younger people, and our treatment sample had more older people than the control sample, so we'll just improve the results of all the old people'? It seems like a black box of opportunities for data manipulation in the direction you want. I think the onus should be on researchers to make the control group a good match so that adjustments aren't needed. And I don't understand the rationale for adjusting for the impact of the individual doctors 'treating' the participants.

    Ugh
     
    Last edited: Jun 15, 2024
  4. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
    7837 people identified from GP records
891 excluded or not sent a participation pack - mostly due to the GP saying they weren't suitable, e.g. because their symptoms probably are due to a real disease; probably some GPs also knew certain patients would not be receptive

    6946 sent the participation pack
    5688 did not return the pack!!
    1258 assessed for eligibility

    904 were excluded, due to not being eligible or not wanting to proceed

354 randomly assigned to the treatment or control. That's about 5% both of the people identified from GP records and of the people sent a participation pack (a quick check of the arithmetic is sketched below). There is serious selection bias there. If this treatment turned out to be wildly successful in this patient group, that selection bias would need to be borne in mind - this trial would not be evidence of the intervention being successful in the vast majority of cases.
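For anyone checking the funnel arithmetic, a quick sketch using the numbers above:

```python
# Quick check of the recruitment-funnel percentages quoted above.
identified = 7837    # people identified from GP records
sent_pack = 6946     # sent the participation pack
not_returned = 5688  # never returned the pack
randomised = 354     # randomly assigned

print(f"{randomised / identified:.1%} of those identified")      # 4.5%
print(f"{randomised / sent_pack:.1%} of those sent a pack")      # 5.1%
print(f"{not_returned / sent_pack:.1%} never returned the pack") # 81.9%
```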
     
  5. Nightsong

    Nightsong Senior Member (Voting Rights)

    Messages:
    229
    This reference is to Nimnuan et al. Medically unexplained symptoms: an epidemiological study in seven specialties; I had a brief skim of this paper. Initial thoughts:

The MUS definition was based on the absence of a "conventional biomedical explanation" 3 months after routine examination and investigations. HADS was used to detect anxiety/depression; there were no structured interviews. They did not find associations between MUS and depression/anxiety (as measured by HADS) or work status, which seems to contradict a great many other psychosomatic papers, and they do not really address why physical, but not psychological, attributions would not be similarly associated. The sample size and response rates were lower than expected. Some results achieve statistical significance but look like weak associations with wide CIs; there is no correction for multiple comparisons. The %MUS figures seem high and highly variable: the highest in gynaecology (66%) and the lowest in dental (37%), with 45% in rheumatology, 53% in cardiology, 58% in gastroenterology and 62% in neurology - these seem very high. Additionally, the authors developed novel tools for symptom review and illness cognition, but there was no detailed discussion of their psychometric properties.
     
    Last edited: Jun 15, 2024
  6. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
    Of the 354, 176 were assigned to usual care and 178 to the intervention
    135 assigned to usual care made it to 52 weeks; 144 in the intervention made it to 52 weeks.

The numbers used in the intention-to-treat analysis were those still present at 52 weeks, so did not include people who withdrew or were lost to follow-up. The numbers who withdrew or were lost to follow-up were similar in both groups and not very large; more withdrew in the control group.
3 of the people in the control group did not have baseline data; presumably that was imputed from their 3-month data.


That's interesting. So, one GP dropped out with no reason given, having seen some participants, and nothing is said about what happened to their patients and the data.

    As expected.

All the better to (tr)eat you... (a reference to the wolf in Little Red Riding Hood)
     
    Last edited: Jun 15, 2024
  7. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
Well yes, and even then 'conceivably' is doing a lot of heavy lifting. As @rvallee would say, the giant spaghetti monster in the sky is conceivable; that doesn't mean it is likely.

As Nightsong said, the Toussaint paper reported a standard error of measurement of 2.3, which seemed to be offered as an estimate of the minimal clinically important difference.
And this paper found an improvement of 1.82. So that's not actually clinically important. But they argued that, because the 95% confidence interval of -2.67 to -0.97 includes the minimal clinically important difference, they could still say their intervention helped.

So, I would say 'in a subjective measure in an unblinded study of highly selected individuals that cannot be assumed to be representative of the disease population, comparing a group that got something to a group that got nothing, and with some dropouts and who knows what they did with their model "adjustments", even then, they did not achieve the bare minimum that could conceivably be called clinically important.'
     
    Last edited: Jun 15, 2024
    alktipping, Eleanor, rvallee and 6 others like this.
  8. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
I think, just assuming a normal distribution with that mean and that 95% confidence interval, there is about a 13% chance that the true mean reaches the supposed minimal clinically important difference of -2.3 or beyond. Which makes sense if you look at the reported 95% confidence interval of -2.67 to -0.97: -2.3 is well towards the end of the range. A question for people with better statistics knowledge than me (ignoring all the other problems with this paper and just looking at the maths): can the researchers credibly claim that this study shows the intervention achieves a minimal clinically important difference?
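Here's a minimal sketch of that back-of-envelope calculation, assuming the reported confidence interval reflects an approximately normal estimate of the between-group difference (the authors' actual model may differ):

```python
# Back out the implied standard error from the reported 95% CI, then ask
# how likely it is that the true difference reaches the supposed MCID.
from scipy.stats import norm

mean = -1.82             # reported adjusted between-group difference
lo, hi = -2.67, -0.97    # reported 95% confidence interval
se = (hi - lo) / (2 * 1.96)   # implied standard error, ~0.43

mcid = -2.3              # Toussaint-based minimal clinically important difference
p = norm.cdf(mcid, loc=mean, scale=se)
print(f"P(true difference <= {mcid}) ~ {p:.0%}")   # ~13%
```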

If you set that alongside the harms that they have not properly assessed, and the cost of providing the intervention, I think there is a lot of smoke and mirrors and no real benefit.

    The authors seem to be suggesting that, for the people it works for, it works really well. I haven't seen the data on that yet.
     
  9. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
    Results

There were people who didn't attend the initial session or who stopped attending, and that will probably have affected the results, but the rates of participation (for the 4 sessions) aren't too bad.
    No statistically significant difference at 13 weeks or 26 weeks.

Table 2 gives the means for a range of other measures at 52 weeks for the two groups. It is concerning that the number of participants varies quite a lot for each measure - missing data creates opportunities for inconvenient results to be lost. There is no analysis of statistical significance or minimal clinically important difference. There is no comparison with baseline measures. There is no discussion of these other measures in the results section at all.

    Here's one: PGIC Patient Global Impression of Change
    Usual care 1.8
    Intervention 3.2
It therefore looks as if the usual-care people were reporting that their health had, on average, 'much improved' tending to 'very much improved', while the intervention people reported that their health had, on average, 'minimally improved' tending to 'no change'. I find that hard to believe, and I wonder if data got entered in the wrong columns. Either way, it reflects badly either on the utility of the intervention or on the reliability of the paper.

    I'd appreciate it if someone could check that PGIC finding. It might be worth poking around in the other data there.


A very low percentage of participants reported their health-care data. I haven't looked at the differences between groups, but the authors suggest the differences are minor.

    On adverse events - some of the participants turned out to have medical conditions that may have explained their 'persistent physical symptoms'.
     
  10. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,494
    Location:
    Mid-Wales
    It found a *between-group* difference of 1.82.

The median scores at baseline in both groups were 15 - but we also need to consider that the entry criterion was a score between 10 and 20: they excluded those with high (20-30) or low (0-10) scores.
This improved (on average) at 52 weeks to 14 in the usual-care group (a 1-point difference), and to 12 in the ignore-your-symptoms group (a 3-point difference).
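A toy illustration of the within-group versus between-group distinction, using the medians above (the paper's 1.82 is a modelled, adjusted between-group estimate, so the crude figure won't match it exactly):

```python
# Within-group vs between-group change on the PHQ-15, from quoted medians.
baseline = 15
usual_care_52w = 14
intervention_52w = 12

within_usual = baseline - usual_care_52w                  # 1 point
within_intervention = baseline - intervention_52w         # 3 points
crude_between_group = within_intervention - within_usual  # 2 points (crude)

print(within_usual, within_intervention, crude_between_group)
```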

    Studies like this should be banned.
     
    Mithriel, Sean, Robert 1973 and 7 others like this.
  11. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,494
    Location:
    Mid-Wales
    This makes me so cross!

What I'd like them to have done is maybe a trial where they use their "intervention" as the placebo control, and do some proper work-up and investigation in a true intervention group.

    But the problem is always going to be, unless they actually find out what is wrong and treat it, there will likely still be little difference between the two groups.

    It's a very sorry state of affairs. :(
     
    Mithriel, Sean, alktipping and 3 others like this.
  12. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
Yes, sorry if it wasn't clear. Improved relative to the outcome in the 'do nothing' group.
     
  13. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,494
    Location:
    Mid-Wales
    In the appendix (pdf link: https://www.thelancet.com/cms/10.10...92cea412-1aa4-4f3b-82d5-0b3e51d62615/mmc1.pdf), they have the scale the other way around: "0= No change or condition has got worse; 7= A great deal better"

    But the form itself is very confusing, because the two scales seem contradictory.

It looks like they've reported the top half of the scale, and not the bottom Likert bit.

    tbf - that also seems to match the data they got from the PHQ-15 pretty well (as can be seen in the full results plot).
     


    Last edited: Jun 15, 2024
  14. rvallee

    rvallee Senior Member (Voting Rights)

    Messages:
    12,919
    Location:
    Canada
    All things considered, this looks a lot like the FINE trial. Definitely, trials like this should be banned. They have zero legitimate value and make a mockery of science.

I tried to find what they mean by negotiating symptoms. There's nothing about it. As for explaining symptoms, the model is supposed to be shown in Figure 1, but that graphic is literally just a dot. Which is oddly fitting. I guess they forgot to add it.

This reads like all the junk techno-buzzwords you hear from business suits who understand nothing of the underlying technology but are used to BSing their way through with marketing-speak:
The skills of "holism", which is obviously not a skill, and whatever the hell "generalism" means, it clearly doesn't apply to making stuff up and convincing people of things that are false. One interpretation would be general intelligence, but this isn't it at all. In fact this makes a mockery of true general intelligence, which scoffs at small-minded nonsense like this. These are shady sales techniques, nothing more.

Isn't 'statistically significant but not clinically significant' supposed not to matter? In their conclusions they report that it's only statistically significant, and that further research should, as is tradition, evaluate cost-effectiveness. But it's not clinically significant - a bar that is clearly set way too low anyway - so it can't be cost-effective, by definition.

    From a cited paper on what they mean by a "rational" explanation, another Burton paper:
    And of course that's total BS. They're sort of "elevating themselves by their own bootstraps" here.

    What decades of never having to make sense does to people's reasoning skills. Even medieval alchemists would find them pedestrian.
     
  15. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,494
    Location:
    Mid-Wales
£1.5 million spent on this (https://www.dev.fundingawards.nihr.ac.uk/award/15/136/07)

    This is the protocol paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9668014/

    But I'm having trouble finding out exactly what "The Symptoms Clinic" does in the 90-110 minutes it has of a patient's attention.

    The two cited studies for the Symptoms Clinic in the protocol are these:

1. Morton L, Elliott A, Thomas R, Cleland J, Deary A, Burton C. Developmental study of treatment fidelity, safety and acceptability of a symptoms clinic intervention delivered by general practitioners to patients with multiple medically unexplained symptoms. J Psychosom Res 2016;84:37-43. doi: 10.1016/j.jpsychores.2016.03.008. https://pubmed.ncbi.nlm.nih.gov/27095157/

2. Burton C, Weller D, Marsden W, Worth A, Sharpe M. A primary care symptoms clinic for patients with medically unexplained symptoms: pilot randomised trial. BMJ Open 2012;2:e000513. doi: 10.1136/bmjopen-2011-000513. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3330253/

    The most they provide is this:
Although I suspect this study provides more details (ref 19 in the main paper): https://pubmed.ncbi.nlm.nih.gov/37441925/ ... ETA - yes it does - full open-access version here: https://www.sciencedirect.com/science/article/pii/S0738399123002501?via=ihub
     
    Last edited: Jun 15, 2024
  16. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
    Thanks Lucibee.

    It looks as though they have completely changed the way the PGIC is scored, as well as using two different methods of collecting scores. And even then, the intervention doesn't come out well. Bear with me, it's confusing, and I hope I haven't confused myself. But, if I haven't, then it confirms my impression of the work of these people - it would be a joke if it didn't have an impact on people's lives.

    PGIC - Patient Global Impression of Change - so it's an important measure

    Here's the approach I found when googling PGIC:
    And that's fair enough, there is the full range from very much improved, right through to very much worse. A low value is good.


    But the top part of the PGIC that is reported in this study's appendix, as copied in Lucibee's post above, reverses the order:

[Screenshot: the upper scale of the PGIC form from the study's appendix, with anchors running from 'No change (or condition has got worse)' up to 'A great deal better']

But look - the range runs essentially only from 'no change' to 'wonderful'. And they don't say how that scale is scored. I reckon that scale alone sums up all the problems with the psychosomatic medicine proponents. But, incredibly, these authors have managed to add further layers of obfuscation and confusion to it.



    The second scale in the PGIC scale appendix does have the full range:

[Screenshot: the second scale of the PGIC form from the appendix, a 10-point scale]
But that is a 10-point scale, and it runs from better to worse, so lower numbers are better.


    It's all a mess - we don't know what scale was actually used.

    To remind ourselves of the numbers that were reported in this study for PGIC:
    Usual care 1.8
    Intervention 3.2

So, if the bottom scale in the appendix was used, the average outcome reported by the usual-care people was better than the average outcome reported by the intervention people!!

If the top scale in the appendix was used, with a logical scoring system running from 1 (low) to 7 (high), the average intervention score of 3.2 is still only 'A little better, but no noticeable change'.
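If that reading is right, here's a hypothetical sketch of the mapping. The anchor labels are the standard 7-point PGIC wording, which matches the quotes later in this thread; the 1-7 coding is an assumption, since the paper doesn't state it:

```python
# Assumed coding of the upper appendix scale: 1-7, labels paraphrased from
# the standard PGIC form (the paper itself doesn't state the coding).
PGIC_LABELS = {
    1: "No change (or condition has got worse)",
    2: "Almost the same, hardly any change at all",
    3: "A little better, but no noticeable change",
    4: "Somewhat better, but the change has not made any real difference",
    5: "Moderately better, and a slight but noticeable change",
    6: "Better, and a definite improvement that has made a real and worthwhile difference",
    7: "A great deal better, and a considerable improvement that has made all the difference",
}

def nearest_label(mean_score):
    """Map a group's mean PGIC score to the nearest anchor label."""
    return PGIC_LABELS[round(mean_score)]

print("Usual care (1.8):", nearest_label(1.8))    # "Almost the same..."
print("Intervention (3.2):", nearest_label(3.2))  # "A little better..."
```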
     


    Theresa, alktipping and Trish like this.
  17. Trish

    Trish Moderator Staff Member

    Messages:
    53,396
    Location:
    UK
Maybe write to the authors and ask how the figure was calculated. They shouldn't be reporting a figure without a clear source.
     
  18. Trish

    Trish Moderator Staff Member

    Messages:
    53,396
    Location:
    UK
    How can something be 'a little better but no noticeable change'? If the change isn't noticeable, how do they know it's a little better?
A scale that allows only one negative/no-change option and lots of variations of positive should be illegal. It's not a research instrument; it's the cheat's version of market research.
     
  19. Lucibee

    Lucibee Senior Member (Voting Rights)

    Messages:
    1,494
    Location:
    Mid-Wales
    @Hutan It looks like they've used the upper portion of the scale - which should be graded 1-7.

Table 9.2 has a typo - it says in the right-hand column that the scale scores 0 to 7. But it's clear from the questionnaire itself that 1 should be "No change (or worse)" and 7 = "A great deal better", because there are only 7 items.

If Table 6 is correct, the intervention can only be superior if they've used this upper scoring method. So the bottom scale is superfluous.

That's fine. It agrees with the graph of the PHQ-15 scores.

So on that scale, 1.8 = "Almost the same, hardly any change" (usual care), and 3.2 = "A little better, no noticeable change" (intervention).

But they should have said in the main paper that this was the case, because it looks a lot less rosy.
     


    Trish and Hutan like this.
  20. Hutan

    Hutan Moderator Staff Member

    Messages:
    27,828
    Location:
    Aotearoa New Zealand
    Yes, that looks right.

[Image: the version of the PGIC used in the study]

There are different versions of the PGIC. I don't understand how any scientist could look at the version these authors used (the version above), which so obviously minimises declining health and any harm from the intervention, and which has vague, difficult-to-differentiate descriptions, and think that it was OK to use. It suggests that all of the literature using the PGIC is suspect until proven otherwise.

P.S. Even the instruction on the questionnaire serves to underline to the people in the 'care as usual' group that they didn't get care at the special clinic, and so leads them to a default of 'no change'.
     
    Last edited: Jun 15, 2024
    Lucibee and Trish like this.
