MECFS data analysis thread

Discussion in 'Other research methodology topics' started by Murph, Mar 22, 2024.

  1. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    I'm keen to see if we can trawl existing data to find patterns.

    There are enough small and medium-sized studies out there - could real findings emerge from combining them?

    This thread is for that purpose and I'd like it to include:
    • A list of data that's available: metabolomics, proteomics, lipidomics, cytokines etc.
    • Ways to access that data in formats that are easy to analyse.
    • Links to ways to analyse that data, e.g. code and spreadsheets. (I use R and it's a powerful way for people to do and share analysis, although harder to learn than Excel.)
    • Outputs of that analysis.
    A few starting projects to pursue:
    1. Cross-checking the findings in the NIH study against existing studies to see if any existing findings are reinforced / become disputed or if new signals emerge.
    2. Seeing if there's a metabolomic signature of endoplasmic reticulum stress and figuring out if it's present in any prior studies; in order to sense-check Hwang's WASF3 paper.
    3. Combining old metabolomics studies to see where they agree.

    This would I hope be a collective endeavour so please suggest ideas that could be worth looking at!
     
    Simon M, Nightsong, horton6 and 16 others like this.
  2. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    Peter Trewhitt, Amw66, Kitty and 8 others like this.
  3. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,175
    Location:
    London, UK
    I admire your enthusiasm, Murph but I think the answer is no.

    Having spent a lifetime chasing clues in pathogenesis in chronic disease I know that the clues that matter tend to stick out like a sore thumb. Small statistical differences, which is all we have ever seen for anything in ME, aren't going to be the answer. And trawling through studies looking for similarities is very much post-hoc and largely statistically uninterpretable.
     
  4. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    I'd like to shout out to a couple of inspirational projects for this kind of meta-analysis.

    Brydges, Che, Lipkin & Fiehn 2023
    and
    Kaczmarek 2023.

    The former does a Bayesian analysis on 3 metabolomics papers and finds peroxisomes and prostaglandins stand out as important.

    The latter does a pathway analysis on four miRNA studies and finds VEGFA to be most central. This second paper has an especially impressive aspect because the author is a patient (who some people may have encountered as necessary8 in other forums).

    Patients can't necessarily do expensive lab work but we can bring brainpower to bear on previous studies and make sure what is known is properly understood and exploited.

    Another possible way this endeavour could add value - maybe we will find inconsistencies or data errors in some findings and realise the whole field should approach those findings with extra care.
     
  5. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    One possible thing we may be able to point out is that untargeted metabolomics delivers no consistency in findings and isn't worth pursuing. In fact noticing that lit a fire under my curiosity. I'm pretty shocked at how little correspondence I find between different metabolomic studies. For example, the correspondence between these two studies by Hanson is low.
    hansonvshanson.jpeg

    But how does it look if we cut it by sex or only look at the significant findings / tending significant findings? I want to know!
     
  6. mariovitali

    mariovitali Senior Member (Voting Rights)

    Messages:
    516
    This is a very important thread. Thank you very much, @Murph. When it comes to any kind of data analysis, we have to establish the ground truth, and this has been a big problem since I started exploring what could be responsible for ME/CFS. I strongly believe that data analysis and AI will help us understand what is going on. For the record, certain analytical methods were able to identify what conventional research has found a median of 5.5 years earlier, as seen below. References are available.

    Screen Shot 2024-03-22 at 12.43.23.png
     
    Peter Trewhitt, Amw66, Murph and 6 others like this.
  7. CRG

    CRG Senior Member (Voting Rights)

    Messages:
    1,860
    Location:
    UK
    If ME/CFS is a single pathological entity then maybe what you suggest could identify things that have been missed - but to me that initial 'IF' is a problem. IMO it is likely that what we recognise as ME/CFS in any individual is the outcome of more than one pathological process (both cascade and discrete) which in combination do not map across the patient population. If that is the case then any signatures in small studies are likely to evaporate into Simpson's Paradox. I've suggested one way of looking at the problem here: An analogy to explain ME/CFS complexity: Three body problem
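
    To make the Simpson's Paradox point concrete, here is a minimal sketch with made-up numbers. The strata are labelled "A" and "B" and could stand for patient subgroups or for studies run on differently calibrated platforms; both the labels and the values are assumptions for illustration, not data from any paper. Within each stratum the marker is higher in ME than in HC, yet the naive pooled comparison points the other way.

```python
from statistics import mean

# Made-up marker values. Stratum A reads systematically higher than B,
# and the ME/HC mix differs between strata (both are assumptions).
strata = {
    "A": {"ME": [105, 106], "HC": [100, 101, 99, 100, 100, 100]},
    "B": {"ME": [55, 56, 54, 55, 55, 55], "HC": [50, 50]},
}


def diff(me, hc):
    """Mean difference, ME minus HC."""
    return mean(me) - mean(hc)


# Difference within each stratum.
within = {name: diff(d["ME"], d["HC"]) for name, d in strata.items()}

# Naive pooling that ignores which stratum a sample came from.
pooled_me = [x for d in strata.values() for x in d["ME"]]
pooled_hc = [x for d in strata.values() for x in d["HC"]]
pooled = diff(pooled_me, pooled_hc)

print(within)  # positive in both strata
print(pooled)  # negative once the strata are mixed
```

    The reversal comes purely from the unequal mix of ME and HC across strata with different baselines, which is why any combined analysis would need to keep the stratum as a term in the model rather than pooling raw values.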

    It's to be hoped that the DecodeME study will have had a large enough patient cohort to at least identify some locations of concern, but even with a cohort of circa 20k I think we are going to be lucky to get definitive targets.
     
  8. Jonathan Edwards

    Jonathan Edwards Senior Member (Voting Rights)

    Messages:
    15,175
    Location:
    London, UK
    I think we actually know that already.

    And even if a slight shift in metabolism shows up on several studies it is more likely to be some downstream effect of what sort of activity PWME undertake or what pills they swallow. Unravelling mechanisms isn't usually like squeezing blood out of a stone. It is more like noticing that a ton of bricks landed on a doorstep. The hard part is just knowing which doorstep to check.
     
  9. chillier

    chillier Senior Member (Voting Rights)

    Messages:
    237
    Is this comparing the significant metabolites from at least one study? Or is it comparing all of the detected metabolites regardless of whether they were found to be different?

    It looks like most of the metabolites are hovering in a bubble around a fold change of one which suggests that the two studies mostly agree about most metabolites having no change between ME/HC. Do the significant ones agree?
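
    One way to check this is to correlate the log fold changes over the metabolites both studies measured, first for all of them and then only for those significant in at least one study. The sketch below assumes each study has been reduced to a metabolite -> (fold change, p-value) table; the names `study_a`, `study_b` and every number in them are invented for illustration.

```python
import math

# Hypothetical per-study summaries: metabolite -> (fold change ME/HC, p-value).
# These numbers are made up; they are not taken from any paper.
study_a = {
    "citrulline": (1.30, 0.01),
    "proline": (0.95, 0.60),
    "glutamate": (1.20, 0.04),
    "ceramide": (0.80, 0.03),
    "caffeine": (1.02, 0.90),
}
study_b = {
    "citrulline": (1.25, 0.02),
    "proline": (1.04, 0.70),
    "glutamate": (0.90, 0.30),
    "ceramide": (0.85, 0.04),
    "urate": (1.10, 0.20),
}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def compare(a, b, alpha=None):
    """Correlate log2 fold changes over metabolites present in both studies.

    If alpha is given, keep only metabolites with p < alpha in at least one
    study. Returns (metabolites used, correlation), with the correlation set
    to None when fewer than 3 metabolites remain.
    """
    shared = sorted(set(a) & set(b))
    if alpha is not None:
        shared = [m for m in shared if a[m][1] < alpha or b[m][1] < alpha]
    if len(shared) < 3:
        return shared, None
    xs = [math.log2(a[m][0]) for m in shared]
    ys = [math.log2(b[m][0]) for m in shared]
    return shared, pearson(xs, ys)


all_shared, r_all = compare(study_a, study_b)
sig_shared, r_sig = compare(study_a, study_b, alpha=0.05)
print(len(all_shared), r_all)
print(sig_shared, r_sig)
```

    Working on log2 fold changes puts "no change" at zero and treats a doubling and a halving symmetrically, so a cloud centred on zero shows up as agreement on no difference rather than as disagreement.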
     
    Simon M, Peter Trewhitt and Kitty like this.
  10. chillier

    chillier Senior Member (Voting Rights)

    Messages:
    237
    This makes sense about the metabolomics done at baseline but I think the two studies Hanson's group did across 4 time points over two CPETs are interesting.

    The blood plasma one argued for changes in ME, following exercise, in metabolites relating to urea cycle/glutamate/arginine/proline metabolism that were not seen in HCs. (Also, one of the most significant metabolites, of unknown identity, was later found in an independent screen to be associated with glutamate.)

    The urine one argued for huge changes in what also look to me like urea cycle metabolism in HC following exercise but not in ME. It was small (n = 11, I think) and questions about deconditioning etc. remain, but it's definitely a sore thumb.

    These could both come to nothing with replication but I think replication is definitely desirable and it's not necessarily a blood out of a stone situation.
     
    Last edited: Mar 22, 2024
  11. poetinsf

    poetinsf Senior Member (Voting Rights)

    Messages:
    341
    Location:
    Western US
    Not important, but you can always export it as CSV from Google Sheets and then load it with read.csv in R if you don't have Excel. (I'm tired of MS asking for my CC number whenever I open up Excel.)
     
    Peter Trewhitt, Murph and Kitty like this.
  12. Creekside

    Creekside Senior Member (Voting Rights)

    Messages:
    1,217
    There's also the problem of how far the samples are from the root cause of ME. If fish are dying near some small town way up the Amazon river, taking samples at the mouth is probably not going to reveal the cause. If ME's "sore thumb" factor is localized in a small part of the brain, you may not find anything in serum samples, no matter how many studies you compile. The critical factor may also have a short lifespan and not even reach the serum. Maybe the factor is relatively abundant in serum, and it's just the localized level that is abnormal. Maybe it's multiple factors, some elevated, some depressed, and the localized ratios cause the abnormal response. There's simply no guarantee that there is an answer to be found in serum or even CSF samples.

    I'm not against reprocessing the available data, since it's not impossible that it could reveal something. It's a matter of personal judgement of chance of success vs effort required.
     
  13. FMMM1

    FMMM1 Senior Member (Voting Rights)

    Messages:
    2,812
    I was surprised that untargeted metabolomics hadn't provided leads. I think Dr. Li (Jackson Laboratory) suggested the data is fairly limited [Metabolon*]. They've found a few common differences but they haven't tested for much (% wise)**
    However, I'm not sure if there is much more they can discover re metabolomics by re-running that standard (1000 feature) commercial test*. Possibly Dr. Li could suggest an alternative to broaden the scope of the metabolomics search.


    *"Then metabolomics data were actually generated by metabolome commercial vendor." - https://event.roseliassociates.com/...on_final_MECFS-Research-Roadmap_Webinar-3.pdf

    **"So, our metabolomic data have some agreement with the recent two papers from Hanson Lab and from the Columbia group. So, see there are mostly on sphingomyelins, ceramides, cholesterols, and also hydroxyglutamate there."
     
  14. Sean

    Sean Moderator Staff Member

    Messages:
    8,064
    Location:
    Australia
    Yep, I think the lack of an answer so far is most likely due to simply not looking in the right place (or in the right way).

    Also suspect that we don't lack the technical tools to do it, just the conceptual basis on which to accurately target them.
     
  15. FMMM1

    FMMM1 Senior Member (Voting Rights)

    Messages:
    2,812
    Yes, check out the Sheng Li, Ph.D. (Jackson Laboratory) talk (part of the NIH webinar series*). Note the slide states that the 1000-metabolite standard commercial test (i.e. Metabolon) covers "only a quarter" of the "total metabolites" (that's probably increasing too).
    I think metabolomics is basically used when you don't know the target pathway or gene. It is hypothesis/concept free, similar to GWAS [DecodeME].
    Just wondering if we've someone who could explain the gaps in metabolite coverage and the options to fill in those gaps?

    The gaps aren't just in metabolite coverage - Dr Li also states:
    "So, for priorities, I think, I don't know if you're agree with me on it or not. So, I feel that this field is really missing state-of-art systems immunology. So, transcriptomic data from the blood, whether it's a bulk data or single-cell data, can inform a lot more things than whatever you have seen in this field."
    I wonder what @Jonathan Edwards's view of that statement is?

    *Metabolism - https://event.roseliassociates.com/me-cfs-research-roadmap/recordings/ - 48.35



    upload_2024-3-23_10-45-41.png
     
    Simon M, Kitty and Murph like this.
  16. Murph

    Murph Senior Member (Voting Rights)

    Messages:
    147
    This is an excellent question. We can cut it by significance and we should. I'll try to do it myself at some point soon but if someone else wants to have a go before I get to that, that would be delightful!

    My initial sense is that for some pairs of studies, cutting down to metabolites that are significant at p=.05 will leave no metabolites in common to compare between them! But it's worth doing the actual work to check that.
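
    Counting that overlap is a small set operation once each study is reduced to a metabolite -> p-value table. A minimal sketch, where `p_a`, `p_b` and their values are invented for illustration:

```python
def significant(pvalues, alpha=0.05):
    """Return the set of metabolite names with p < alpha."""
    return {name for name, p in pvalues.items() if p < alpha}


def overlap_report(p_a, p_b, alpha=0.05):
    """Count metabolites measured in both studies and significant in both."""
    measured_both = set(p_a) & set(p_b)
    sig_a = significant(p_a, alpha) & measured_both
    sig_b = significant(p_b, alpha) & measured_both
    return {
        "measured_in_both": len(measured_both),
        "significant_in_a": len(sig_a),
        "significant_in_b": len(sig_b),
        "significant_in_both": sorted(sig_a & sig_b),
    }


# Made-up p-values, for illustration only.
p_a = {"citrulline": 0.01, "proline": 0.60, "glutamate": 0.04, "ceramide": 0.20}
p_b = {"citrulline": 0.30, "proline": 0.02, "glutamate": 0.50, "urate": 0.01}

report = overlap_report(p_a, p_b)
print(report)
```

    In this toy case three metabolites are measured in both studies, each study has its own significant hits, and none are significant in both. In real data the metabolite names would also need harmonising between studies before the intersection means anything.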
     
    Simon M, EndME, Kitty and 4 others like this.
  17. chillier

    chillier Senior Member (Voting Rights)

    Messages:
    237
    I've checked here, using a linear model, whether there is evidence for a given metabolite that the two studies disagree in their fold change estimate (this depends on both the fold change estimates and the variance of the data).

    For the whole set of metabolites detected in both papers, you can see the papers actually agree (or strictly speaking: don't necessarily disagree) on the vast majority of metabolites; it's just that what they're agreeing on is that there's no good evidence that the metabolite is different between ME and HC. Looking only at the significant ones, you can see it's roughly half that disagree.

    allmetabs_bothpapers.png significantmetabs_bothpapers.png

    Anyway, none of the metabolites that agree and are significant pass multiple test correction. The ones that pass multiple test correction and don't agree appear to be drug-related metabolites. To be fair, Germain/Hanson 2022 already argue there isn't really much of a difference in the metabolomes at baseline; it's only after exercise that it gets interesting.
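
    For anyone wanting to repeat the multiple test correction step: the post does not say which method was used, so the sketch below assumes Benjamini-Hochberg (false discovery rate), which is one common choice for metabolomics. The input p-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four metabolites.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
print(adj)
```

    A metabolite "passes" at a chosen false discovery rate (say 0.05) if its adjusted value is below that threshold. With a thousand or so metabolites per panel, a raw p of 0.04 will rarely survive.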

    (The p values here are recalculated from a combined analysis of both datasets with a log linear model and disagreement determined from an interaction term between group and paper)
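
    A rough sketch of that kind of interaction test for a single metabolite follows. It is not the code used for the plots above. It assumes individual-level abundances from both papers, and it relies on the fact that in a 2x2 design (group by paper) the least squares interaction coefficient equals the difference between the two papers' log fold changes. The function name, the cell labels and the data are all hypothetical, and the p-value uses a normal approximation that is only rough for small samples.

```python
import math
from statistics import NormalDist, mean


def interaction_test(cells):
    """Group-by-paper interaction for one metabolite on the log2 scale.

    cells maps (group, paper) -> list of raw abundances, with group in
    {"ME", "HC"} and paper in {"A", "B"}.
    """
    logs = {k: [math.log2(v) for v in vals] for k, vals in cells.items()}
    means = {k: mean(v) for k, v in logs.items()}
    lfc_a = means[("ME", "A")] - means[("HC", "A")]
    lfc_b = means[("ME", "B")] - means[("HC", "B")]
    estimate = lfc_a - lfc_b

    # Pooled residual variance across the four cells (df = N - 4).
    rss = sum((x - means[k]) ** 2 for k, v in logs.items() for x in v)
    df = sum(len(v) for v in logs.values()) - 4
    se = math.sqrt(rss / df * sum(1 / len(v) for v in logs.values()))
    t = estimate / se
    # Normal approximation to the t distribution (an assumption).
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return {"lfc_a": lfc_a, "lfc_b": lfc_b, "interaction": estimate,
            "t": t, "df": df, "p_approx": p_approx}


# Made-up abundances: paper A shows a doubling in ME, paper B shows none.
cells = {
    ("ME", "A"): [8, 16, 8, 16],
    ("HC", "A"): [4, 8, 4, 8],
    ("ME", "B"): [4, 8, 4, 8],
    ("HC", "B"): [4, 8, 4, 8],
}
result = interaction_test(cells)
print(result)
```

    Because each paper's fold change is computed within that paper, a constant scale difference between the two platforms cancels out of the interaction estimate. A large interaction relative to its standard error is evidence that the papers disagree. A small one only means disagreement was not detected.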
     
    Simon M, Murph, Sean and 6 others like this.
  18. Trish

    Trish Moderator Staff Member

    Messages:
    55,414
    Location:
    UK
    Thanks for doing this, it looks like a useful analysis. Did Hanson's team do a comparison of this sort too?

    Is there data for after exercise that can be analysed in the same way? And is there any data for people not on any medication that might produce false positive results, and what about male and female data analysed separately? I guess for the women there may also be differences across their menstrual cycle. Also I'm guessing dietary differences may affect metabolites, e.g. if on keto or vegan diets?

    I'm no expert, but it seems logical to me that there are so many influences on metabolites in the blood that unless there's something that stands out and people say, wow, that's obviously very different, the rest is just noise.
    I think there were some differences they found in urine after exercise that seemed to show clear differences.
     
    Simon M, Murph, Sean and 4 others like this.
  19. chillier

    chillier Senior Member (Voting Rights)

    Messages:
    237
    Thank you :)

    They didn't do an analysis where they compared multiple datasets (from different papers). I mean that at baseline in their two day CPET paper (2022) they didn't report many metabolites having a difference between ME and HC. I think 7 metabolites. Then following exercise there were many more changes in glutamate metabolism related things.

    The urine paper I think is sampled from a subset of the same cohort and had really striking results as you say, but all those possible sources of bias you mention are still a concern, especially in light of some of the comments @Midnattsol has made about their experience with metabolomics data. You might hope that having a longitudinal design where you compare the differences between time points of the same group of people that might reduce the noise at least in part.

    I'd definitely like to compare post exercise plasma metabolomics, although I'm not aware that there are other papers that have done that to the quality of Hanson 2022 with their n=45? sized cohort and 4 time points.
     
    Simon M, Murph, Sean and 4 others like this.
  20. wigglethemouse

    wigglethemouse Senior Member (Voting Rights)

    Messages:
    1,009
    I think the Hanson team discussed at one point that the earlier studies were not ideal because samples were not processed quickly enough or in a well-controlled manner, so levels were inconsistent due to half-life effects on metabolites with shorter half-lives. In more recent Hanson studies the sample handling and processing were better controlled. This seems to be a big issue in metabolism studies.

    I think this reasoning partly led to the Hanson team longitudinal studies with an exercise challenge where patient to patient variances were better controlled as they were looking at differences before and after exercise in each individual and then comparing those differences between groups.
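
    As a sketch of that design, and not the Hanson team's actual pipeline: compute each participant's change on the log scale, then compare the changes between groups. Welch's t statistic is used here as an assumed, simple choice of comparison, and all values are invented.

```python
import math
from statistics import mean, variance


def deltas(pre, post):
    """Per-participant change in log2 abundance (post minus pre)."""
    return [math.log2(b) - math.log2(a) for a, b in zip(pre, post)]


def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Made-up abundances for one metabolite; each position is one participant.
# Baselines vary a lot between people, but the within-person change is tight.
me_pre, me_post = [10, 40, 20, 80], [11, 42, 22, 84]
hc_pre, hc_post = [12, 35, 25, 70], [24, 66, 52, 150]

me_delta = deltas(me_pre, me_post)
hc_delta = deltas(hc_pre, hc_post)
t, df = welch_t(hc_delta, me_delta)
print(t, df)
```

    In this toy data the baselines span nearly an order of magnitude, which would swamp a plain ME vs HC comparison at a single time point, while the per-person changes separate the groups cleanly.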

    I guess a simpler way to explain - garbage in garbage out if you don't implement optimized sample handling and processing.
     
    Simon M, Murph, Sean and 7 others like this.
