Preprint Replicated blood-based biomarkers for Myalgic Encephalomyelitis not explicable by inactivity, 2024, Beentjes, Ponting et al

Discussion in 'ME/CFS research' started by Andy, Aug 28, 2024.

  1. Hutan

    Hutan Moderator Staff Member

    Messages:
    29,101
    Location:
    Aotearoa New Zealand
    I agree with most of this. The paper does need reworking. As you say, we need graphs of actual values (with separate charts for male/female, and adjustments for age if appropriate).

    I don't have a background in advanced statistics, so I say the following with much hesitation and willingness to be corrected, but there are statistical techniques useful for identifying groups and subgroups based on clusters of attributes, things like PCA and random forests. If there are any promising clusters based on differences in individual molecule levels (and perhaps ratios of molecules), then it would be useful to see if a lack of activity can explain the differences. If physical activity does explain some attribute levels, those ones could be removed and classifying statistical techniques like PCA repeated, to see if any identified groups remain. I think we need that order of analysis - identify differences that are probably real differences in an identified group of individuals, and then consider if any of the differences are just the result of lifestyle/medicine differences.

    A family member did an analysis of a large collection of plants, looking to see if there were sufficient differences to divide the plants into two species, or just subspecies, or no division at all. I mention this, because I find it a useful, concrete way to think about how a differentiating analysis can be done. He had a whole lot of measured attributes including things like flower structure, leaf length and leaf width, as well as ratios of leaf length to leaf width. The classifying analyses showed that the plants divided neatly into two distinct groups. He then needed to consider if things like altitude or latitude could explain the differences, using regressions and spatial analysis, and found there was still a real difference. He then went on to speculate on the biological basis for the collection of attributes in each proposed species. I think the same sort of approach to the analysis, the same order of analysis, could be useful for studies like the one that is the subject of this thread, with physical activity being a possible confounder, like altitude might have been for the differences of my family member's plants.

    Regarding the moved posts, we left enough to indicate that the problem had been raised in the context of this study and include some discussion of it. There were 27 posts that went into more detail or repeated points that were moved. I think SOD3 is another good example of where it is useful to split off the discussion. Personally, I would prefer that there is a separate thread for SOD3, rather than having a wide-ranging discussion of it on a study thread that covers many other things. That way, it is easier to go into much more detail and it is easier for people to find and follow the discussion. Threads on studies that have reported findings in ME/CFS cohorts can be linked.
     
    hotblack, sebaaa, alktipping and 8 others like this.
  2. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,003
    Location:
    London, UK
    I don't fully understand mediation analysis but my guess is that it hinges on a correlation. If physical activity correlates to the physiologic measures found then one possibility is that these measures indicate part of the cause of the inactivity - via a process we think of as the cause of ME/CFS. So maybe they shouldn't be thrown out. One of the things that worries me about the paper is that all causes end up at the blood tests. But the blood tests are only going to turn out to be interesting if they reflect causes of the syndrome, which is not an option on the diagram if I remember rightly.
     
    Last edited: Sep 12, 2024
    hotblack, Ash, alktipping and 4 others like this.
  3. Hutan

    Hutan Moderator Staff Member

    Messages:
    29,101
    Location:
    Aotearoa New Zealand
    A correlation between physical activity and a molecule doesn't prove much. As you say, the levels of the molecule might be an indication of disease, and the disease might be causing inactivity. But, if the 'ME/CFS' group and the healthy group both have the same relationship between levels of the molecule and activity levels (in terms of a regression line), and the levels of the molecule in the two groups (and indeed the levels of activity) overlap to a significant degree, then I think the molecule doesn't have much use for classification.
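
    A minimal sketch of the scenario described above, using made-up numbers: both groups share one regression line of molecule level on activity, so the raw difference between groups disappears once activity is accounted for. The data, the function names and the choice of a simple pooled least-squares fit are all illustrative assumptions, not the preprint's analysis.

```python
from statistics import fmean


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: cases are less active, but molecule level follows the
# same line (2 + 0.5 * activity, plus small wiggles) in both groups.
wiggle = [0.1, -0.2, 0.2, -0.2, 0.1]
control_activity = [4, 5, 6, 7, 8]
case_activity = [1, 2, 3, 4, 5]
control_molecule = [2 + 0.5 * a + w for a, w in zip(control_activity, wiggle)]
case_molecule = [2 + 0.5 * a + w for a, w in zip(case_activity, wiggle)]

# Raw group difference (cases minus controls).
raw_difference = fmean(case_molecule) - fmean(control_molecule)

# Fit one line to everyone, then compare what is left over in each group.
slope, intercept = fit_line(control_activity + case_activity,
                            control_molecule + case_molecule)
control_resid = [m - (intercept + slope * a)
                 for a, m in zip(control_activity, control_molecule)]
case_resid = [m - (intercept + slope * a)
              for a, m in zip(case_activity, case_molecule)]
adjusted_difference = fmean(case_resid) - fmean(control_resid)

print(f"raw difference: {raw_difference:.2f}")
print(f"difference after adjusting for activity: {adjusted_difference:.2f}")
```

    In real data the line would need to be fitted within each group and the two fits compared; identical lines are built in here only to keep the example small.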
     
    hotblack, alktipping, voner and 5 others like this.
  4. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,003
    Location:
    London, UK
    That's an important point but if the molecule was a mediator of fatigue that synergised with some other factor in ME/CFS maybe you would get the same regression line.

    Another good point. I think it reflects another general concern that I have that tests that help us identify groups with common natural history (often called 'diagnostic'!) nearly always fall outside two standard deviations from the mean in a good proportion of probands. There are some exceptions but mostly where there is an unbroken spectrum of risk-bearing factors - like lipids or blood pressure. There are also some oddities like urate but they are few.

    If you rely on shifts or correlations in measures within the normal range and require combinations picked out by machine learning etc. then increasingly you put all your eggs in a statistical basket. There may be situations where that works. But if you are doing it to try to make a diagnostic category crisper then you have a Catch-22 if you use the fuzzy clinical categorisation to do your statistics on.

    My guess is that there is a risk that anything up to 80% of patients picked up on the screening criteria may not have ME/CFS as we like to understand it. ME/CFS is not that common - maybe one person in 500. As many as one person in fifty in the 40-70 age group is likely to acquire a diagnosis like ME or CFS simply because it is easier for the GP to provide an explanation for low grade poor health that does not fit another category. Worse still, those 80% will bring with them all sorts of shifts in physiology that will confound a purely statistical approach to discriminatory tests.
     
    sebaaa, alktipping, Hutan and 10 others like this.
  5. DMissa

    DMissa Senior Member (Voting Rights)

    Messages:
    139
    Location:
    Australia
    How does one resolve such ascertainment problems?

    Are you saying that current case criteria are unreliable, or that the rigor of their application in cohort selection is lacking? (or are you saying both?)

    If it is the former, how does one perform meaningful research? Would this not also confound mechanistic studies?

    Are you suggesting instead a detailed case study approach to generate testable hypotheses? (eg: what seemed to happen with Paul Hwang's WASF3 work.) Or no?

    With this in mind please interpret my foregoing questions through the following lens: I am a young (29 year old) scientist whose strongest calling is to make his ME/CFS research rigorous and meaningful. I am earnestly here to listen to your experience.
     
    Last edited: Sep 13, 2024
    Ash, ME/CFS Skeptic, sebaaa and 13 others like this.
  6. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,003
    Location:
    London, UK
    Thanks for the questions. It is complicated but I will try to give some answers.

    I guess that you cannot so much 'resolve' the ascertainment problems as work knowing what they are. Ascertainment of diagnosis of RA was always uncertain, although not as bad as ME/CFS, but it didn't get in the way of working out how to treat people.

    If you compare a cohort of barn-door typical cases with healthy controls it doesn't matter very much what your case criteria are. And no one set can be claimed to be better than another. I don't think I ever knew the case criteria for RA, any more than I know them for a dog.

    The problem comes if 'grey area' cases affect the data. And grey area cases may be commoner than barn door, both for RA and ME/CFS. Are you fatigued more than half the time? The answer to that can depend on all sorts of things. Even something like joint swelling judged by a physician will vary greatly in grey area cases. My juniors were often sceptical about subtle joint swelling that I could recognise because I had seen it just like that for years. They might also score swelling that I knew was not due to the joint.

    This particular study relies on reports that are likely to pick up grey area cases of ME/CFS. The study was well worth doing because it made use of a potentially valuable large data resource, but I think it illustrates just how complex confounding factors might be. To give an idea of the problem, let's take RA, fibromyalgia and ME/CFS. In RA for every 1000 people there will be maybe 5 barn door cases and maybe 5 cases who really do have the pathology of RA (which has been reasonably defined for 100 years in terms of chronic lymphocytic infiltration of synovia) but for whom being sure of that is very hard. We even have defined categories of 'probable' and 'possible' RA and we don't include these in studies. There will also be maybe 10-20 cases of people who have been told they have RA at some time but don't. And that is the easy situation. For fibromyalgia studies have suggested that physicians vary up to a hundred fold in their rate of making the diagnosis. That is the worst case. I see ME/CFS as likely in between.

    You might say that none of this is too disastrous because if grey area cases are misdiagnosed it will just blunt the statistical significance figures a bit. An important correlation will still show through. But if you factor in the way probands are recruited things get trickier. If you recruit from clinics there may be all sorts of spurious differences between people who come to clinics and people who do not. If you go to a population asking for volunteers you might do better but you still have people acquiring diagnoses by going to a doctor. And if they go to a doctor for other reasons too that may affect how likely they are to get an ME/CFS diagnosis.
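
    A rough sketch of the 'blunting' argument in its most benign form, assuming that misdiagnosed people look exactly like controls on the measure in question. The function name and the numbers are hypothetical.

```python
def observed_difference(true_difference, fraction_misdiagnosed):
    """Expected case-control difference when some labelled cases are not true cases.

    Assumes the misdiagnosed fraction has the same average as controls, so the
    true difference is simply diluted. Real confounding may not be this kind.
    """
    if not 0.0 <= fraction_misdiagnosed <= 1.0:
        raise ValueError("fraction_misdiagnosed must be between 0 and 1")
    return (1.0 - fraction_misdiagnosed) * true_difference


for fraction in (0.0, 0.2, 0.5, 0.8):
    diluted = observed_difference(1.0, fraction)
    print(f"{fraction:.0%} misdiagnosed -> observed difference {diluted:.2f}")
```

    This covers only simple dilution. If recruitment or diagnosis is itself linked to the measure, the misdiagnosed group is no longer neutral and the bias need not shrink towards zero.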

    On the forum we constantly moan about doctors diagnosing chronic fatigue too widely. Most doctors aren't bothered whether they are diagnosing chronic fatigue NOS or PEM-carrying ME/CFS. The names they use will probably reflect their choice of shirt colours and voting behaviour.

    For the great majority of diseases none of this really matters in the end because the clues to disease mechanism and treatment stick out like a sore thumb when you find them. When someone finally bothered to look at smoking and risk of RA the association was barn door. But if you are trawling for statistically significant differences in a large population within the normal range I think the chances of getting spurious associations are 100%.
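
    One narrow part of this concern can be put in numbers: a tiny systematic shift between groups, of the kind confounding might produce, becomes 'significant' once the sample is large enough. The sketch below uses a standard two-sample z statistic with equal group sizes and a known standard deviation; the shift of 0.05 standard deviations and the group sizes are invented for illustration.

```python
import math


def z_statistic(mean_shift, sd, n_per_group):
    """z for a difference in means between two equal-sized groups with common sd."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    return mean_shift / standard_error


for n in (100, 1_000, 10_000):
    z = z_statistic(mean_shift=0.05, sd=1.0, n_per_group=n)
    verdict = "significant" if abs(z) > 1.96 else "not significant"
    print(f"n per group = {n:>6}: z = {z:.2f} ({verdict} at the 5% level)")
```

    The shift itself never changes; only the sample size does. Statistical significance in a very large cohort therefore says little about whether a difference reflects the disease or the way people came to be labelled.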

    Genetic studies are much easier because correlations to gene alleles pretty much have to be causal. The causation can be indirect and obscure but it is still causal. I had concerns about using internet calls for recruiting for DecodeME but I now see that those concerns are probably not too great. Gene alleles for metabolic pathways or immune functions are pretty unlikely to be commoner in people who volunteer for research. But spurious associations between volunteering for a Biobank and obtaining an ME or CFS diagnosis are much more plausible, both being behavioural traits relating to one's health status.
     
    Jacob Richter, Ash, sebaaa and 14 others like this.
  7. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,003
    Location:
    London, UK
    And also, I may have got this point wrong in relation to this particular study and I would like anyone who can point that out to me to be able to do so.
     
    Ash, alktipping, DMissa and 5 others like this.
  8. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,003
    Location:
    London, UK
    To be a bit clearer, what I think we need are tests that indicate, however indirectly, however much just a hint of where to look, something about what is going wrong in an illness like RA or ME/CFS. Those tests may seem like 'diagnostic tests' and maybe you could call them that at times but if you conceive of their purpose as diagnostic tests, as a scientist, you are likely to start chasing things that don't exist.

    It would probably be even more reasonable often to call these tests 'biomarkers' but the discussion makes clear that nobody is quite sure what that term is supposed to mean and I have never found it useful. By and large I think it is almost as misleading as diagnostic test.
     
    Ash, sebaaa, alktipping and 8 others like this.
  9. Adrian

    Adrian Administrator Staff Member

    Messages:
    6,547
    Location:
    UK
    t-SNE is a good mechanism for visualizing multidimensional data: t-distributed stochastic neighbor embedding - Wikipedia
     
    Kitty, Hutan and Michelle like this.
  10. forestglip

    forestglip Senior Member (Voting Rights)

    Messages:
    799
    @Chris Ponting I think the 5% recovery rate is inaccurate:
    It seems to be a mistake in the abstract of the cited source, while the correct median study recovery rate is in the full text, at 7%. See here.
     
  11. Trish

    Trish Moderator Staff Member

    Messages:
    55,073
    Location:
    UK
    Moderator note

    We have read the discussion of whether posts should be moved and taken note of points made. We have decided to leave most of the discussion even though it is a diversion from the research topic and breaches Rule 9: Specific moderation decisions should not be discussed publicly. A few posts restarting the discussion and repeating points already made have been deleted.

    Please in future, if you think moderators have made a mistake in moving your post on this or another thread, use the contact moderators button or contact a moderator privately to discuss it.
     
    Ash, sebaaa, alktipping and 18 others like this.
  12. Hutan

    Hutan Moderator Staff Member

    Messages:
    29,101
    Location:
    Aotearoa New Zealand
    Ash, Kitty, shak8 and 4 others like this.
  13. hotblack

    hotblack Senior Member (Voting Rights)

    Messages:
    276
    Location:
    UK
    Trish, sebaaa, Jacob Richter and 11 others like this.
  14. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,003
    Location:
    London, UK
    Chris said he was going to post a new version. I haven't had a chance to look at it yet. It should be worth going through in detail.
     
    Trish, Jacob Richter, bobbler and 4 others like this.
  15. Hutan

    Hutan Moderator Staff Member

    Messages:
    29,101
    Location:
    Aotearoa New Zealand
    New abstract (paragraphs added by me for improved readability)

    The only change from version 1 in the abstract is the use of ME/CFS as the disease name.

    There's just this bit that I think still needs more work
    They say that ME/CFS status had a significant effect on only one trait, and then they say that, by contrast, ME/CFS had a significant direct effect on 290 traits.

    Perhaps they meant to say something like that 'ME/CFS status had a significant indirect effect via a mediator related to inactivity on only one trait'? I could be wrong about what they are meaning to say, as I haven't read the new paper, but I don't think the two statements as they are written can be true.

    I made a comment about this before - am I misunderstanding something?
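
    For readers unsure how a 'direct' and an 'indirect' effect can differ so much, here is a minimal product-of-coefficients sketch. It is a textbook-style linear mediation on invented, noise-free data, not the model used in the preprint; `status`, `mediator`, `trait` and all coefficients are hypothetical.

```python
from statistics import fmean


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def mediation_effects(status, mediator, trait):
    """Split the effect of status on trait into direct and indirect parts."""
    a, a0 = fit_line(status, mediator)           # status -> mediator
    total, _ = fit_line(status, trait)           # status -> trait, unadjusted
    # Part of the mediator not explained by status (Frisch-Waugh step).
    mediator_resid = [m - (a0 + a * s) for s, m in zip(status, mediator)]
    b, _ = fit_line(mediator_resid, trait)       # mediator -> trait, given status
    # Remove the mediator's contribution, then see what status still explains.
    direct, _ = fit_line(status, [t - b * m for t, m in zip(trait, mediator)])
    return {"total": total, "direct": direct, "indirect": a * b}


# Invented example: cases (status 1) are less active, and activity raises the trait.
status = [0, 0, 0, 0, 1, 1, 1, 1]
spread = [-1, 1, -1, 1, -1, 1, -1, 1]            # activity variation within groups
mediator = [2 - 1.5 * s + e for s, e in zip(status, spread)]
trait = [10 + 0.8 * s + 0.5 * m for s, m in zip(status, mediator)]

effects = mediation_effects(status, mediator, trait)
print(effects)
```

    In this toy case the direct effect (0.8) and the indirect effect through activity (-0.75) nearly cancel, so the total effect is small. A sentence about 'a significant effect' needs to say which of the three it means.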
     
    Last edited: Sep 28, 2024
    Kitty, bobbler, MeSci and 5 others like this.
  16. Kitty

    Kitty Senior Member (Voting Rights)

    Messages:
    6,680
    Location:
    UK
    Yes, it does come across oddly.

    I read it as trying to emphasise that ME/CFS status had a significant effect on only one activity-related trait (the Duration of Walk one), whereas there were direct effects on 290 non-activity related traits.

    It doesn't say that at all, so I may be wrong, but I can see why they might want to make that kind of point.
     
    bobbler, hotblack, Sean and 1 other person like this.
  17. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,003
    Location:
    London, UK
    My guess is that 'indirect' should be in there. That is how I understand it.
     
    Last edited: Oct 1, 2024
    Kitty, hotblack, Sean and 1 other person like this.
  18. mariovitali

    mariovitali Senior Member (Voting Rights)

    Messages:
    514
    @Chris Ponting If you are reading this, please look at the following thread on Twitter. I have used a proprietary software framework since 2017, and when I feed it the differentially expressed proteins mentioned in your paper, transferases receive a very high relevance ranking. The thread can be found here:

    https://twitter.com/user/status/1841060345017721278
     
    Sean, forestglip and hotblack like this.
  19. hotblack

    hotblack Senior Member (Voting Rights)

    Messages:
    276
    Location:
    UK
    Kitty, Sean, Yan and 3 others like this.
  20. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,003
    Location:
    London, UK
    I have been looking at the new version. The introduction is much crisper. I find the mathematical modelling hard to relate to the biology I am familiar with but I think I have a reasonable idea what the overall picture is.

    Various people have picked out particular proteins they think are interesting. It is good to see that these are independent of inactivity and Chris assures me that the BMI levels are not significantly different. For me the real prize would be if one or more of these proteins matched a signal on DecodeME in some way.
     
    sebaaa, Robert 1973, MrMagoo and 13 others like this.
