Preprint Initial findings from the DecodeME genome-wide association study of myalgic encephalomyelitis/chronic fatigue syndrome, 2025, DecodeMe Collaboration

So it's like they used all possible genes but weighted them by how much the SNP signal from the GWAS points to them? That would make more sense and make their results more interesting.
Yes, that's how I understand it. For each gene, they look at all the many SNPs that are in or around that area of the DNA where the gene's code is located to give that gene a score based on how significant those nearby SNPs were in the GWAS (how high they are in the manhattan plot). They account for linkage disequilibrium between SNPs to not "double-count" a genetic signal in a gene's score if multiple SNPs are all significant together just because of LD.
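To make the "double-counting" point concrete, here is a minimal sketch. It is not MAGMA's actual gene model. It assumes a much simpler combined statistic: the sum of the SNP z-scores divided by its standard deviation under the null, which depends on the LD correlation matrix. The function name and the toy numbers are made up for illustration.

```python
import math

def combined_gene_z(snp_z, ld):
    """Combine SNP z-scores into one gene-level z-score, given an LD correlation matrix.

    Under the null, the sum of correlated z-scores has variance equal to the
    sum of all entries of the correlation matrix, so we divide by its square root.
    """
    n = len(snp_z)
    variance = sum(ld[i][j] for i in range(n) for j in range(n))
    return sum(snp_z) / math.sqrt(variance)

# Three SNPs near one gene, each with z = 3 (hypothetical values).
z = [3.0, 3.0, 3.0]
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
perfect_ld = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

print(combined_gene_z(z, independent))  # three independent signals reinforce each other
print(combined_gene_z(z, perfect_ld))   # one signal seen three times stays at z = 3
```

With no LD the three SNPs count as separate evidence, while in perfect LD they collapse back to a single z of 3, which is the behaviour described above.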
 
For each gene, they look at all the many SNPs that are in or around that area of the DNA where the gene's code is located to give that gene a score based on how significant those nearby SNPs were in the GWAS (how high they are in the manhattan plot)
In that case it might be quite important. I wonder if we should interpret the likelihood of possible genes in light of this MAGMA analysis: those that are not expressed in the brain might be less likely to be a relevant gene compared to those that are highly expressed in the brain (Figure 4 in the paper)?
 
In that case it might be quite important. I wonder if we should interpret the likelihood of possible genes in light of this MAGMA analysis: those that are not expressed in the brain might be less likely to be a relevant gene compared to those that are highly expressed in the brain (Figure 4 in the paper)?
Hmm. I think on its face that does make sense (though not sure how much confidence this actually allows us to have in selecting genes based on expression). I wonder if any of the papers I linked above that did MAGMA tissue analyses did anything similar. I didn't read any yet, just grabbed the plots.

Edit: I'm just thinking it still might be possible that one of the significant loci has to do with another part of the body, so I'm worried about dismissing a non brain gene prematurely.

I'm thinking maybe the plot shows that the brain is important, but doesn't necessarily conclusively say which parts of the body are not important.

Testes, EBV-transformed lymphocytes, and muscle are next highest after brain (though not significant after adjustment). The second two make some sense as well for ME/CFS.

Edit 2: Actually, it looks like they're not even significant before correction (both around p of 0.1).
 
my feeling is that it's really early for anyone to be saying with much confidence that the genes they found point to any specific pathway.

I tend to agree. For most of the loci there are multiple potential genes implicated, and each of the genes is involved in multiple pathways.
My feeling is that it’s worth making a distinction between “these identified genes have previously been found to be important in immune and nervous system function, suggesting that those broad areas are interesting directions for future focus” and “these results show that the illness is driven by neuro-immune pathways.”

You’re both definitely right that those genes are not exclusively expressed in either system, and we have no way of knowing which functions of those genes are relevant. Or if they are relevant in maintaining disease state or indirectly predisposing to its trigger.

And like you said @forestglip, anyone with a bit of time on their hands could probably link several of these genes to spin whatever story they wanted. Hell, AI can do it for you.

I do also agree with @chillier that there’s a need to generate good testable hypotheses. In which case these genes would be an additional piece of evidence in favor of a hypothesis, but the hypothesis should also be able to stand on its own.

I certainly don’t think these results should be used to exclude a viable hypothesis if other evidence or reasoning points in its favor—trying to push all ME/CFS research into a strictly neuro-immune boat solely on the basis of these results would be shooting ourselves in the foot. But it does make that look like a more promising direction overall.
 
Many thanks. Try as I might, I cannot find anything about non-coding genes. The paper states "There were 43 protein-coding genes with at least one eQTL within an ME/CFS genome-wide significant interval, and we prioritised 29 ME/CFS candidate causal genes among them..." which sounds like they only looked at protein-coding genes.
I think the GWAS arrays were specifically designed to focus on protein coding genes. Just like Whole Exome Sequencing. It's only relatively recently that Whole Genome Sequencing has become more competitively priced for full coverage, but still more expensive than GWAS arrays.
 
How are we to reconcile the somewhat different genes and tissues of:

Fig. 3. MAGMA gene-tissue analysis shows statistically significant enrichment of ME/CFS-related genes in all 13 brain tissues.
Fig. 4: Approximate Bayes factor posterior probability (PPH4) that mRNA expression and ME/CFS traits are associated and share a single causal variant.
 
How are we to reconcile the somewhat different genes and tissues of:

Fig. 3. MAGMA gene-tissue analysis shows statistically significant enrichment of ME/CFS-related genes in all 13 brain tissues.
Fig. 4: Approximate Bayes factor posterior probability (PPH4) that mRNA expression and ME/CFS traits are associated and share a single causal variant.
They're just different methods of predicting the important genes, so the results won't be exactly the same. (Again, keep in mind those 13 genes aren't necessarily related to the brain enrichment.)

The candidate genes were determined based on genes known to be differentially expressed due to a significant variant. The 13 MAGMA genes were basically just the genes with the most highly significant variants nearby.

Lots of the candidate genes in fig. 4 show brain expression, so I don't see a disconnect with the tissue enrichment part.
 
But it's hard to say how many immune-related links we would expect with 8 hits. We would have to random sample some SNP hits or loci, count the number of potential implicated genes and their immune-related pathways. It would be a lot of counting and not entirely objective.
Maybe a better way is to estimate the number of immune genes. I found one estimate of about 1600 genes: https://pubmed.ncbi.nlm.nih.gov/15789058/ Let's say 2000, or about 10% of all human genes.

As far as I can see, quite a lot more than 10% of candidate genes linked to the DecodeME genetic signals are immune genes.

There may be better, and more recent, estimates than the one I found. But I would be surprised if the proportion of immune genes was much over 10%.
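One rough way to put a number on "quite a lot more than 10%" is a binomial tail probability. The sketch below assumes a 10% background (about 2000 immune genes out of roughly 20,000 human genes) and treats each candidate gene as an independent draw, which is a strong simplification. The count of immune candidates is a placeholder that would have to be filled in by hand; it is not a figure from the paper.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

background = 2000 / 20000  # assumed share of immune genes among all human genes
n_candidates = 29          # prioritised candidate genes, as quoted from the paper
k_immune = 8               # PLACEHOLDER: number of candidates judged to be immune genes

# Chance of seeing at least k_immune immune genes if candidates were drawn at random.
print(binomial_tail(k_immune, n_candidates, background))
```

The answer is only as good as the background rate and the immune/non-immune labelling, both of which are judgment calls.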
 
and come up with reasonable ideas of what experiments to do next, besides more genetics.
It sounds like an interesting thing to do. But before anyone commits resources to do any experiment, it would be useful to have the planned analysis narrow down the likely candidates, which I presume would also have a big impact on the credible hypotheses. Maybe it's because I've been ill so long, but waiting some more months to get a clearer view of the likely targets seems prudent.
 
So while some or all of the 13 top genes might potentially be wrong, the tissue analysis includes every gene.
Could you clarify that, please? The way it was explained to me, MAGMA looks at the set of 13 genes only, and compares them with other genes. It doesn't look at any other ME/CFS genes. Are you saying that's not the case? I may have misunderstood you or what was explained to me originally.
 
As far as I can see, quite a lot more than 10% of candidate genes linked to the DecodeME genetic signals are immune genes.

I think there may be a problem in that immune genes are much more likely to be polymorphic as part of a 'strategy of diversity' to combat changing pathogen environments. MHC genes can be ludicrously polymorphic. A lot of non-immune genes may have virtually no variants beyond a few rare disease-determining ones.
 
Could you clarify that, please? The way it was explained to me, MAGMA looks at the set of 13 genes only, and compares them with other genes. It doesn't look at any other ME/CFS genes. Are you saying that's not the case? I may have misunderstood you or what was explained to me originally.

MAGMA manual if anyone wants to look.

DecodeME says (bolding added):
Thirteen genes were significantly associated with ME/CFS in a MAGMA gene-based test of 18,637 genes (p < 0.05/18637; Table S4). We considered 54 tissue types and identified significant enrichment of these genes’ expression for 13 (p < 0.05/54), all of which were brain regions (Fig. 3).
"These genes" does seem to imply that they looked at expression of specifically those 13 genes. Maybe it's just the wording.
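The two thresholds in the quoted passage are Bonferroni corrections: 0.05 divided by the number of tests. A quick sketch of the arithmetic, using only the numbers given in the quote (the variable names are made up):

```python
alpha = 0.05
n_genes = 18637   # genes in the MAGMA gene-based test
n_tissues = 54    # tissue types considered

gene_threshold = alpha / n_genes
tissue_threshold = alpha / n_tissues

print(f"gene-level threshold:   {gene_threshold:.2e}")    # 2.68e-06
print(f"tissue-level threshold: {tissue_threshold:.2e}")  # 9.26e-04
```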

The bits I understand in the MAGMA papers seem to indicate that the gene-property analysis has a continuous dependent variable Z which is each gene's GWAS score. But I'm out of my depth with the complicated terminology in those papers, so I'm not going to pretend I'm positive that's correct. Hopefully an author or someone more knowledgeable can clarify.
 
@Chris Ponting Would you be kind enough to help us interpret the MAGMA analysis paragraph in the paper (discussion in above post).

Are the 13 genes in table S4 found from gene-based analysis only and then tested for tissue, or is the list a gene-tissue enrichment analysis presented as a gene set (table S4) and then tested against all tissue types in Fig 3? Or is fig 3 showing something different? Which analysis is the last sentence on significance referring to: a tissue one or a non-tissue one?
Thank you. Table S4 genes are from the gene based analysis. And then these genes were tested against the tissue types in Fig 3, which found significant association to 13 brain tissues.
 
Thank you. Table S4 genes are from the gene based analysis. And then these genes were tested against the tissue types in Fig 3, which found significant association to 13 brain tissues.
This non-scientist's understanding would benefit from knowing the variables involved in the gene-set analyses:
Z = B0 + C1·B1 + ... + Cn·Bn + e
 
@Chris Ponting, as arnoble says, can you clarify, was the tissue enrichment based on putting those 13 genes into a discrete gene set and seeing if those 13 genes specifically, without regard for their actual p-values/z-scores, were enriched among all genes expressed in a tissue?

Or was every one of the ~18,000 genes' z-scores considered in the enrichment in a continuous manner (where if genes with high z-scores, which includes, but is not limited to, those top 13, have high expression, and genes with low z-scores have low expression, then the genes are considered to be enriched in the tissue)?

Figure 3 says this was a "MAGMA gene-tissue analysis". Looking at the cited MAGMA paper, the only equations I see for their gene-set analyses have Z as the dependent variable in a linear regression. With Z being:
To perform the gene-set analysis, for each gene g the gene p-value p_g computed with the gene analysis is converted to a Z-value Z_g = Φ^(-1)(1 − p_g), where Φ^(-1) is the probit function.
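As a small illustration of that conversion, here is a sketch using Python's standard library. The function name is made up. It uses the symmetry of the normal distribution, since the probit of (1 − p) equals minus the probit of p, which avoids rounding problems when p is very small.

```python
from statistics import NormalDist

def p_to_z(p):
    """Convert a gene p-value to a Z-value: the probit of (1 - p).

    Computed as -probit(p), which is mathematically identical but
    numerically safer for very small p.
    """
    return -NormalDist().inv_cdf(p)

print(p_to_z(0.5))           # a gene with no signal lands at zero
print(p_to_z(0.025))         # about 1.96
print(p_to_z(0.05 / 18637))  # a gene sitting right at the Bonferroni threshold
```

So every gene gets a Z-value, large for strongly associated genes and near or below zero for the rest, not just the genes that pass the significance threshold.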

Edit: The paper says it used the FUMA platform. FUMA includes MAGMA, so I assume that's where the MAGMA analysis is done. The documentation details the method for tissue analysis, which looks like it uses the z-score for every gene.
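If that reading is right, the tissue analysis is conceptually a regression of every gene's Z-value on its expression in a tissue. The sketch below is only a toy version of that idea: a plain one-variable least-squares slope on invented numbers. The real analysis also adjusts for other covariates (for example average expression across tissues), which this leaves out.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Invented toy data: one entry per gene, for ALL genes, not only the top hits.
gene_z = [4.8, 2.1, 1.0, 0.3, -0.5, -1.2]      # GWAS-derived Z-value per gene
brain_expr = [9.0, 6.5, 5.0, 4.0, 2.5, 1.0]    # expression of each gene in one tissue

slope = ols_slope(brain_expr, gene_z)
print(slope)  # positive slope: higher-expressed genes tend to have higher Z
```

In this framing a positive slope comes from the overall trend across all genes, so the top 13 contribute but do not define the result.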
 
I think there may be a problem in that immune genes are much more likely to be polymorphic as part of a 'strategy of diversity' to combat changing pathogen environments. MHC genes can be ludicrously polymorphic. A lot of non-immune genes may have virtually no variants beyond a few rare disease-determining ones.
Thanks. There were no HLA/MHC genes flagged as likely candidates in DecodeME. Excluding HLA, do you have any feel for what percentage of human genes are immune ones?
 
There were no HLA/MHC genes flagged as likely candidates in DecodeME.

Well, there was the DQ signal, which is yet to be sorted out and did not get into the list for what appear to be technical reasons. But I was using MHC simply as an illustration. I suspect there are lots of polymorphisms in receptors in immune system genes - FcRIIIa, common cytokine receptor, etc, and complement factor 4 is very polymorphic, including duplications, mannose binding lectin is quite often absent .... and so on.

I have no idea what proportion we are talking about, but because these genes are involved in a whole lot of alternative, partially redundant strategies I suspect that they are generally more polymorphic. That may not matter, but I worry that some of these background issues are not fully taken into account.
 