@jnmaciuch Thank you so much for your work. One of the problematic areas in metabolomics is interpreting the results: trying to understand why certain metabolites differ from healthy controls. I would like to give an example of how a different type of analysis can (hopefully) help in generating some hypotheses.
I took the following concepts from the study and submitted them to an information retrieval system: FGF23 AND FGF21 AND CXCL9 AND TNFRSF11B AND TNFRSF9 AND MMP10 AND CSF1
Here are the ranked results (the second snapshot on the right is actually the second page of the results):
[Attachments 25672 and 25673: screenshots of the ranked results]
I believe I have seen you mention macrophages. My question here, as an example: why are proteoglycans the top-ranked concept? And why do glycoproteins also appear so high (are they tightly associated with proteoglycans)?
Feel free to PM me if such tools might be of interest for your work.
Thanks for your engagement! I actually did a similar analysis when I was trying to characterize some of our findings for the text. Unfortunately, it didn’t provide much of anything useful—at least nothing that some Google searching or our pathway analysis didn’t already reveal.
I suspect that your results here are primarily due to the fact that all the analytes in your short list come from the serum Olink Target 96 Inflammation assay. Cytokine interaction with proteoglycans was (and is) a hot topic in extracellular matrix studies, so the literature is pretty saturated with text that links lists of cytokines to proteoglycans.
Macrophages are likely cropping up for the exact same reason: macrophages are important secretors of cytokines, so a lot of the literature links macrophages and cytokines together. Given that this was a serum assay, where macrophages aren't present, it's more likely that this particular signature comes from other cell types.
Either that, or the proteoglycan and macrophage hits are both being driven by references to MMP10, which is a proteoglycan-degrading enzyme that is highly expressed in macrophages and has been extensively characterized in the literature.
I'm not sure what system you're using, but you could confirm this by feeding it different randomized lists of features from that assay and seeing whether the results are unique to your original list.
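I don't know what input format your retrieval system expects, so here is just a minimal Python sketch of that control, assuming a plain boolean query string like the one you posted. The analyte names are a partial placeholder for the 92-marker panel; the idea is simply to generate several size-matched random lists and check whether proteoglycans and macrophages still dominate the top-ranked concepts.

```python
import random

# Partial placeholder list: the analytes from this thread plus a few other markers.
# Substitute the full 92-analyte list from the Olink Target 96 Inflammation
# panel documentation for a proper control.
panel_analytes = [
    "FGF23", "FGF21", "CXCL9", "TNFRSF11B", "TNFRSF9", "MMP10", "CSF1",
    "IL6", "CXCL10", "TNF", "CCL2", "HGF", "OSM", "VEGFA",
]

original_hits = ["FGF23", "FGF21", "CXCL9", "TNFRSF11B", "TNFRSF9", "MMP10", "CSF1"]

random.seed(0)  # reproducible control draws

# Build several size-matched random queries; submit each to the retrieval system
# and note whether the same top-ranked concepts keep coming back.
for i in range(5):
    control_list = random.sample(panel_analytes, k=len(original_hits))
    print(f"Control query {i + 1}: " + " AND ".join(control_list))
```

If proteoglycans and macrophages still come out on top for most of the random draws, the signature is probably a property of the panel (or of cytokine literature in general) rather than of your specific hits.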
As it happens, I just wrapped up a rotation in a lab that does a lot of meta-analysis of scientific literature, often using AI tools. One of the big concerns in the lab was the issue of existing bias in the literature: an analyte gets studied extensively in one context, others cite that finding, methods get developed to measure that analyte and its effects very well, and then those methods prompt people to see if the same analyte crops up in other contexts. It’s sort of a positive feedback loop of literature prevalence—it doesn’t mean the analyte isn’t important, but any text scraping tool is going to conflate literature prevalence with relevance in most cases.
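As a toy illustration of that conflation (all numbers below are invented): a heavily studied cytokine can rack up far more co-mentions with a concept like "proteoglycan" than a rarely studied analyte, even when the rare analyte is more strongly enriched relative to its own literature footprint. A simple independence-based expectation makes the difference visible.

```python
# Toy example of literature-prevalence bias: raw co-mention counts vs. a simple
# fold-enrichment score that normalizes by each term's own literature footprint.
# All counts are made up for illustration.

total_docs = 1_000_000       # hypothetical corpus size
proteoglycan_docs = 20_000   # hypothetical documents mentioning "proteoglycan"

# analyte -> (documents mentioning the analyte, documents mentioning both terms)
counts = {
    "heavily_studied_cytokine": (50_000, 5_000),
    "rarely_studied_analyte": (500, 100),
}

for name, (analyte_docs, co_docs) in counts.items():
    # Expected co-mentions if the two terms appeared independently
    expected = analyte_docs * proteoglycan_docs / total_docs
    enrichment = co_docs / expected
    print(f"{name}: raw co-mentions = {co_docs}, fold enrichment = {enrichment:.1f}")

# The heavily studied cytokine wins on raw counts (5000 vs. 100), but the rarely
# studied analyte is actually twice as enriched (10.0 vs. 5.0) -- a ranking based
# on raw counts alone would miss that.
```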
I think those sorts of tools might be useful in certain cases, but in my experience they overwhelmingly tend to recapitulate whatever has already been extensively discussed, regardless of whether it's relevant to the topic at hand.
But I still find them very useful as a first step for getting a sense of where results might intersect with the existing literature!
Thanks again for the discussion!