Preprint Charting the Circulating Proteome in ME/CFS: Cross System Profiling and Mechanistic insights, 2025, Hoel, Fluge, Mella+

Maybe the fold change in Hoel can be averaged for rows with matching TargetFullName before merging with Germain.
I also see no good solution.
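A minimal sketch of the averaging idea above, assuming each dataset can be read as (TargetFullName, log2 fold change) pairs. The numbers and the helper name `average_by_target` are made up for illustration; this is not code from either study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (TargetFullName, log2 fold change). Values are invented.
hoel_rows = [
    ("Leptin", 0.40),
    ("Leptin", 0.60),
    ("Peroxidasin homolog", 0.15),
]

def average_by_target(rows):
    """Collapse repeated TargetFullName entries to their mean log2(FC)."""
    grouped = defaultdict(list)
    for name, log2fc in rows:
        grouped[name].append(log2fc)
    return {name: mean(values) for name, values in grouped.items()}

hoel_avg = average_by_target(hoel_rows)
print(hoel_avg)
```

After this step each protein name appears once, so a merge with the other dataset can no longer duplicate rows. Whether a plain mean is the right summary when two aptamers target the same protein is a judgment call.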

I kept them in, which is why there are two datapoints for leptin and for 'Fatty acid-binding protein, adipocyte'. For example, if there are two rows for leptin in the Hoel dataset and one in the Germain dataset, the merge duplicates the latter, giving two rows for leptin in the merged dataset.
That's not entirely correct, as some datapoints are counted twice, but given that this only occurs for a limited number of proteins (I found it for 52 out of 605), I think the correlation still gives a crude estimation of how the results of the two studies match?
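To make the duplication concrete, here is a small sketch of a join on protein name, with invented fold changes. The `inner_join` helper is hypothetical and only mimics what a name-based merge does when one side has repeated names.

```python
# Hypothetical values; only the row counts matter here.
hoel = [("Leptin", 0.40), ("Leptin", 0.60), ("Peroxidasin homolog", 0.15)]
germain = [("Leptin", 0.30), ("Peroxidasin homolog", 0.48)]

def inner_join(left, right):
    """Pair every left row with every right row that shares the same name."""
    merged = []
    for name_l, fc_l in left:
        for name_r, fc_r in right:
            if name_l == name_r:
                merged.append((name_l, fc_l, fc_r))
    return merged

merged = inner_join(hoel, germain)
leptin_rows = [row for row in merged if row[0] == "Leptin"]
print(len(merged), len(leptin_rows))
```

The single Germain leptin value ends up in both leptin rows, which is the double counting described above.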

Was mostly interested to see which proteins are high or low in both datasets.
 
IL26 is one of the top hits in this study.
From gene cards:
This gene was identified by its overexpression specifically in herpesvirus saimiri-transformed T cells. The encoded protein is a member of the IL10 family of cytokines. It is a secreted protein and may function as a homodimer. This protein is thought to contribute to the transformed phenotype of T cells after infection by herpesvirus saimiri. [provided by RefSeq, Jul 2008]

IL26 (Interleukin 26) is a Protein Coding gene. Diseases associated with IL26 include Crohn's Disease and Hidradenitis Suppurativa. Among its related pathways are MIF Mediated Glucocorticoid Regulation and TGF-Beta Pathway. Gene Ontology (GO) annotations related to this gene include cytokine activity. An important paralog of this gene is IL10.

May play a role in local mechanisms of mucosal immunity and seems to have a pro-inflammatory function. May play a role in inflammatory bowel disease. Activates STAT1 and STAT3, MAPK1/3 (ERK1/2), JUN and AKT. Induces expression of SOCS3, TNF-alpha and IL-8, secretion of IL-8 and IL-10 and surface expression of ICAM1. Decreases proliferation of intestinal epithelial cells. Is inhibited by heparin. ( IL26_HUMAN,Q9NPH9 )

There is a recent review of IL26
Immunobiology of IL-26
T helper 17 (Th17) cells produce a set of cytokines that include IL-17 family members, IL-21, IL-22, and IL-26. These cytokines all contribute to the classic function of Th17 cells in combatting extracellular infection and promoting inflammation in autoimmune diseases. However, of the Th17 cytokines, only IL-26 has direct antimicrobial activity against microbes and can activate a broad range of immune cells through its ability to bind DNA and trigger pattern recognition receptors. It is noteworthy that IL-26 is produced by mammalian cells, including human Th17 cells, but is absent in rodents. As such, IL-26 is a potential therapeutic target to augment host immune responses against microbial pathogens but also to prevent inflammation and tissue damage in a variety of autoimmune diseases.

On the relationship between IL-26 and interferon gamma:
Relationship between IL-26 and IFN-γ (AI response)
  • Co-expression:
    IL-26 and IFN-γ are often co-expressed in various immune cells, particularly T lymphocytes, suggesting a coordinated role in immune responses.
  • Close Gene Location:
    The genes for IL-22, IL-26, and IFN-γ are located in close proximity on chromosome 12q15, suggesting they may be regulated together.
  • Similar Functions:
    Both IL-26 and IFN-γ can activate certain signaling pathways (STAT1 and STAT3), suggesting they may share some functional overlap.
  • Antiviral Synergy:
    IL-26 and IFN-γ can work synergistically to enhance antiviral responses, as IL-26 can stimulate the production of IFN-γ and vice versa.
  • Potential Role in Autoimmune Diseases:
    Polymorphisms in the IFN-γ/IL-26 gene region have been linked to sex bias in susceptibility to rheumatoid arthritis, suggesting a role for these cytokines in autoimmune diseases.
 
@ME/CFS Skeptic My plot looks the same by the way.

Here are the three genes that were changed in the same direction in both studies and had a q value less than .05 in both:
TargetFullName | EntrezGeneSymbol_hoel | log2(FC)_hoel | log2(FC)_germain | q-value_germain
Cellular retinoic acid-binding protein 2 | CRABP2 | 0.37 | 0.56 | 0.035
Peroxidasin homolog | PXDN | 0.15 | 0.48 | 0.038
Ribonuclease pancreatic | RNASE1 | 0.19 | 0.56 | 0.035

Edit: Added links to GeneCards.
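The filter described in this post (same direction in both studies, q below .05 in both) could be sketched roughly as follows. The rows, field names and the `concordant_hits` helper are all hypothetical placeholders, not the actual merged table.

```python
# Hypothetical merged rows with made-up numbers: log2FC and q-value per study.
merged = [
    {"name": "Protein A", "fc_hoel": 0.30, "q_hoel": 0.010, "fc_germain": 0.50, "q_germain": 0.030},
    {"name": "Protein B", "fc_hoel": -0.20, "q_hoel": 0.020, "fc_germain": 0.40, "q_germain": 0.010},
    {"name": "Protein C", "fc_hoel": 0.25, "q_hoel": 0.200, "fc_germain": 0.35, "q_germain": 0.040},
]

def concordant_hits(rows, q_cutoff=0.05):
    """Return names changed in the same direction with q < cutoff in both studies."""
    hits = []
    for row in rows:
        # A positive product means both log2 fold changes share the same sign.
        same_direction = row["fc_hoel"] * row["fc_germain"] > 0
        both_significant = row["q_hoel"] < q_cutoff and row["q_germain"] < q_cutoff
        if same_direction and both_significant:
            hits.append(row["name"])
    return hits

hits = concordant_hits(merged)
print(hits)
```

Protein B fails on direction and Protein C fails on the Hoel q-value, so only Protein A survives in this toy example.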
 
@ME/CFS Skeptic My plot looks the same by the way.
Thanks for checking! I hope we will see more studies like this that test large amounts of data and make it all available online. It offers so many more possibilities for us (and other researchers) to check particular results and compare them to other studies.

Making data available online is a strong sign that researchers mean serious business (that they do science instead of academics). The Keller study, Hanson group and this Norwegian team all seem truly dedicated to unravelling ME/CFS.
 
I wonder if increased leptin and fatty acid-binding proteins (FABPs) are simply due to ME/CFS patients having relatively more fat. Patients and controls have a similar BMI but the composition of their body mass might be different: more fat and less muscle for ME/CFS patients?
 
Cellular retinoic acid-binding protein 2 | CRABP2
gene card said:
This gene encodes a member of the retinoic acid (RA, a form of vitamin A) binding protein family and lipocalin/cytosolic fatty-acid binding protein family. The protein is a cytosol-to-nuclear shuttling protein, which facilitates RA binding to its cognate receptor complex and transfer to the nucleus. It is involved in the retinoid signaling pathway, and is associated with increased circulating low-density lipoprotein cholesterol.

CRABP2 may play a crucial role in modulating the host's response to viral infections, potentially contributing to both the control of viral replication and the activation of the immune system.

Retinoic Acid in the Immune System
CRABPs are also found as two isoforms with all-trans RA their high-affinity ligand. A differential cell-specific pattern of expression is observed between the two isoforms of CRABPs. CRABP-I is expressed ubiquitously, while CRABP-II expression is limited to the ovaries, uterus, and skin, and correlates in expression to cell types that secrete large amounts of all-trans RA [14]. These two all-trans RA binding proteins exhibit distinct functions, with CRABP-I modulating all-trans RA bioavailability by enhancing its metabolism and tempering its intracytosol presence, and CRABP-II delivering all-trans RA directly to the nuclear RA Receptors (RARs), thus allowing its capture by the RARs [19]. Once delivered to the RARs by holo-CRABP-II, all-trans RA complexed to RARs can mediate profound biological effects on cellular differentiation, proliferation, and apoptosis.

So, as its name suggests, Cellular Retinoic Acid-binding Protein 2 binds retinoic acid in the cell and shuttles it into the cell nucleus. What it's doing in possibly higher than normal levels in ME/CFS blood, I'm not sure. But it does seem to increase when there is a viral infection. There are two forms of CRABP, the one that we are talking about is CRABP2 or CRABPII. I got a bit excited about that report above that CRABP2 has sex differences in expression (i.e. in ovaries and uterus), but the gene card suggests that lots of tissues express it.

Downstream, assuming more CRABP2, it's looking pretty complicated. Assuming more CRABP2 means more retinoic acid affecting protein production: Retinoic acid seems to have different effects on macrophages and monocytes>dendritic cells:
Again, IL-12 levels were reduced when RA-treated macrophages were used as antigen presenting cells (APCs) in cocultures, and more importantly, T cell–derived IFN-γ and IL-4 levels were downregulated and upregulated, respectively.

RA and GM-CSF enhanced the differentiation of monocytes into dendritic-like cells. These RA-DCs exhibited DC morphology and had the phenotype of immature DCs, with increased expression of CD1a, adhesion, and costimulatory molecules. RA-DCs were more effective at inducing CD4+ T cell proliferative responses and increased IL12 production and, unlike macrophages, drove T cells toward an IL12-dependent T-helper cell type 1 response with secretion of IFNg [39].
Link to that reference 39
 
3. Hidden in plain sight is one increased protein of interest (the top one): MCTS1. Gene card says: "Notably, it positively regulates interferon gamma immunity to mycobacteria by enhancing the translation of JAK2 (PubMed:37875108)."

Sorry to be the same broken record: obligatory mention that JAK2 is not exclusive to interferon gamma
 
Thanks for checking! I hope we will see more studies like this that test large amounts of data and make it all available online. It offers so many more possibilities for us (and other researchers) to check particular results and compare them to other studies.

Making data available online is a strong sign that researchers mean serious business (that they do science instead of academics). The Keller study, Hanson group and this Norwegian team all seem truly dedicated to unravelling ME/CFS.

Most journals mandate this these days (for omics data), so we will see pretty much everybody doing it in future.
 
ME/CFS Skeptic said:
I think the correlation still gives a crude estimation of how the results of the two studies match?

Yeah, I think it's good enough for a rough idea.

Should you filter for low p values before running your correlation? If the studies truly correspond you'd find lots of things measured at 1:1 in both studies, there will be a cloud around the centre of the plot and it won't affect your r^2. I think running the correlation on the full dataset is a simple unbiased way to do it that retains the full context.


When I took this approach for some studies a few years ago I even found a negative correlation between studies in some cases (comparing Naviaux against some others made me wonder if Naviaux had accidentally got his numerator and denominator confused): https://www.s4me.info/threads/mecfs-data-analysis-thread.37775/ I also found a disappointing number of "deep, untargeted" metabolomic studies that had almost no crossover with other deep untargeted metabolomic studies in terms of line items. There's just so much a lab could measure!

Nevertheless I love diving in on this, it seems like if a person made a meta dataset of all the metabolomics data (starting with a list of the thousands of molecules measured and seeing which ones have been measured more than once) we could then exploit that.

That said it's a ton of work, as you guys found there's a lot of ways to annotate the name of a protein, metabolite or other molecule, and a dozen different numbering systems.

[Attached image: standards.png]

I found it really hard to match every single line in two databases, whether by KEGG or looking at the names manually, you'd find so many things measured that may or may not be the same (chirality, isotopes, a category that has since been expanded into two subtypes, a name used in only one country, names that became outdated, and a million more examples besides).
 
Since they did the validation portion, it seemed worth checking if there was overlap between the 57 proteins in that part with the thousands in the first part and in Germain. There were overlaps in 51 genes between all three cohorts. In 19 genes, the fold change was in the same direction. Only one had a q-value less than .5 (yes, .5 not .05) in all three:

Fatty acid-binding protein, adipocyte - FABP4
hoel aptamer: q=.0001, log2fc=0.48
hoel antibody: q=.063, log2fc=0.51
germain aptamer: q=.257, log2fc=0.56
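A rough sketch of the three-way overlap check described in this post, using plain sets. The FABP4 fold changes are the ones listed above; the other genes and values are invented placeholders, and the dictionary layout is an assumption.

```python
# Gene symbol -> log2 fold change. FABP4 values come from the post above;
# LEP, GENE_X and GENE_Y are made-up placeholders.
hoel_aptamer = {"FABP4": 0.48, "LEP": 0.30, "GENE_X": -0.10}
hoel_antibody = {"FABP4": 0.51, "LEP": -0.05, "GENE_Y": 0.20}
germain_aptamer = {"FABP4": 0.56, "LEP": 0.25, "GENE_X": 0.10}

cohorts = [hoel_aptamer, hoel_antibody, germain_aptamer]

# Genes measured in all three cohorts.
shared = set(hoel_aptamer) & set(hoel_antibody) & set(germain_aptamer)

def same_direction(gene):
    """True if the fold change has the same sign in every cohort."""
    values = [cohort[gene] for cohort in cohorts]
    return all(v > 0 for v in values) or all(v < 0 for v in values)

consistent = sorted(g for g in shared if same_direction(g))
print(sorted(shared), consistent)
```

A q-value cutoff could be added as a further condition in the same loop if the q-values are stored alongside the fold changes.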

Maybe the following is something. Is "muscle sympathetic nerve activity" something that could be higher in ME/CFS, maybe more of an increase after exercise, and could be making FABP4 higher?

The relationship between muscle sympathetic nerve activity and serum fatty acid binding protein 4 at rest and during isometric handgrip exercise (2024, Physiol Rep)
Abstract
Fatty acid binding protein 4 (FABP4) is highly expressed in adipocytes. Lipolysis, caused by an elevated adrenergic input, has been suggested to contribute to elevated serum FABP4 levels in patients with cardiovascular diseases. However, the relationship between the serum FABP4 and efferent sympathetic nerve activity remains poorly understood.

Twenty-one healthy subjects (average age, 29.1 years; 15 men) performed an isometric handgrip (HG) exercise at 30% of the maximal voluntary contraction until they were fatigued. The beat-by-beat heart rate (HR), blood pressure (BP), and muscle sympathetic nerve activity (MSNA) were recorded. Blood samples were collected at rest and at the time of peak fatigue.

The MSNA, HR, and systolic BP were significantly increased by the HG exercise (all, p < 0.05). MSNA was obtained from 14 patients. The change in the FABP4 on HG exercise was significantly correlated with the change in MSNA (bursts/100 heartbeats) (R = 0.808, p < 0.001) but not with changes in other parameters, which might, in part, reflect an association of efferent sympathetic drive with FABP4. Meanwhile, resting FABP4 levels were not associated with any parameters including MSNA, in healthy individuals.

Future studies on patients with elevated sympathetic activity are warranted to examine the relationship further.
 
Nevertheless I love diving in on this, it seems like if a person made a meta dataset of all the metabolomics data (starting with a list of the thousands of molecules measured and seeing which ones have been measured more than once) we could then exploit that.
Yes the fact that this is not a thing has been irritating me all day as I crudely try to match up genes from one study with a single other study. It basically feels like it would be free data if such a database existed. There are many 'omics studies that would be great to cross-reference against.

I imagine there must also be good algorithms for dealing with summary statistics from different studies to find the most promising targets, instead of just kind of filtering by arbitrary p-values or looking at a chart for what looks high in two studies at once. More than that, it'd be useful if it could analyze related genes instead of just matching up genes that are exactly the same.
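One long-established option for combining summary statistics is Fisher's method for combining p-values. The sketch below is only an illustration of that technique, not something used in either study. It assumes independent studies and raw p-values (not q-values), and it ignores the direction of the effect, so it would need to be paired with a sign check.

```python
import math

def fisher_combined_p(p_values):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k df under the null.

    Each p-value must be greater than 0 and at most 1.
    Returns (test statistic, combined p-value).
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Exact chi-square survival function for an even number of degrees of freedom (2k).
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return statistic, tail

# Made-up p-values for one protein in two hypothetical studies.
statistic, combined_p = fisher_combined_p([0.04, 0.10])
print(statistic, combined_p)
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula.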

If every study would not only attach a supplementary data spreadsheet to their paper, but also submit the data in a nice format to a centralized database, which can also run these algorithms, I think we'd be dealing with much more statistical power to find proteins of interest.

I think maybe mapMECFS was planning to do an actual standardized database? I'm not sure, I no longer have access as they started requiring institutional email addresses.

I just can't believe it's not worth it. It would be even better for it to be all diseases, not just ME/CFS, so that one could compare between diseases and benefit from economies of scale. To instead run these huge panels in studies, list a few significant proteins in the text, and basically forget the rest of the not super significant proteins feels like a huge waste.
 
@ME/CFS Skeptic My plot looks the same by the way.

Here are the three genes that were changed in the same direction in both studies and had a q value less than .05 in both:
...
Peroxidasin homolog | PXDN | 0.15 | 0.48 | 0.038
...

Peroxidasin is interesting to me, it might help explain POTS.

Mammalian Peroxidasin (PXDN): From Physiology to Pathology

PXDN expresses in the endothelial cells and secretes into blood. PXDN exhibits with much higher concentration in plasma than MPO [20]. Therefore, it is reasonable to speculate that PXDN also plays an important role in vascular tone under physiological and pathological conditions.



According to that review it also seems to be involved in extracellular matrix and fibronectin. I don't know too much about extracellular matrix and fibronectin but I do know that these are bits of biology that keep coming up! Collagen-associated functions suggest a possible link to Ehlers Danlos or similar connective tissue issues.
 
A good spreadsheet for the human eye would have studies in rows and molecules in columns.

But I think this data would be too big for the eye. You'd need what they call long data, or tidy data, where each row contains only one patient-to-control ratio, and many markers by which you can filter it.

A good machine readable table would have perhaps the following columns:

molecule common name | control to patient ratio | year | study | fluid studied (e.g. serum, plasma, saliva, urine) | target (e.g. amino acids, proteins, immune proteins) | total sample size | male sample size | female sample size | sex to which ratio applies (M/F/All) | molecule other name | KEGG | PubChem ID | OR minimum | OR max | sd | published p-value | published q-value | exercise provocation status | Pathway molecule is in (general) | Pathway molecule is in (detailed) | data source url

If we could wrestle a few data sources into that format then we could put them together and filter the whole thing by molecule, sex etc. to find what various studies said.
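As a sketch of what one row of such a long-format table might look like in code, here is a cut-down record with only a handful of the columns listed above. The class name `Finding`, the `select` helper and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Finding:
    """One row of long-format data: a single ratio from a single study."""
    molecule: str
    ratio: float              # direction of the ratio must be the same in every row
    year: int
    study: str
    fluid: str                # e.g. serum, plasma, saliva, urine
    sex: str                  # "M", "F" or "All"
    q_value: Optional[float] = None

# Made-up rows for illustration only; not taken from any study.
findings = [
    Finding("Molecule A", 1.40, 2025, "Study 1", "plasma", "All", 0.01),
    Finding("Molecule A", 1.20, 2023, "Study 2", "plasma", "F", 0.25),
    Finding("Molecule B", 0.80, 2023, "Study 2", "serum", "All", None),
]

def select(rows, **criteria):
    """Keep rows whose attributes equal every given criterion."""
    return [r for r in rows if all(getattr(r, k) == v for k, v in criteria.items())]

molecule_a = select(findings, molecule="Molecule A")
plasma_female = select(findings, fluid="plasma", sex="F")
print(len(molecule_a), len(plasma_female))
```

Appending a new study is then just a matter of adding rows, and any column can serve as a filter.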

What you'd lose is the richness in some of the Hanson datasets where she provides patient-by-patient data, which can be really useful to look at. But it'd be super valuable and easy to append new findings.

Perhaps once you had 30,000 rows you could blast through some statistical noise and see patterns, or perhaps after some molecules had come up in 10 studies without ever showing any consistency, you could make the argument metabolomics is a false friend and a waste of effort. Either way it's useful.
 
Should you filter for low p values before running your correlation? If the studies truly correspond you'd find lots of things measured at 1:1 in both studies, there will be a cloud around the centre of the plot and it won't affect your r^2. I think running the correlation on the full dataset is a simple unbiased way to do it that retains the full context.
Here's with merging on all rows from both studies using TargetFullName. Spearman r of .23
[Attached image: upload_2025-6-2_0-17-28.png]
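For anyone wanting to reproduce this kind of number without extra libraries, a Spearman correlation is a Pearson correlation computed on ranks. The sketch below uses average ranks for ties. The function names and the example fold changes are made up, and this is not the code that produced the .23 above.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation as Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented log2 fold changes for five proteins matched by name in two studies.
fc_study_1 = [0.37, 0.15, 0.19, -0.10, 0.05]
fc_study_2 = [0.56, 0.48, 0.56, 0.02, -0.20]
rho = spearman(fc_study_1, fc_study_2)
print(rho)
```

Because it works on ranks, a few extreme fold changes have less pull on the result than they would with a Pearson correlation on the raw values.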
 