Preprint Initial findings from the DecodeME genome-wide association study of myalgic encephalomyelitis/chronic fatigue syndrome, 2025, DecodeMe Collaboration

@forestglip I am not sure if you have seen this analysis by Paolo Maccallini:

https://github.com/paolomaccallini-hub/MetaME?tab=readme-ov-file

I am particularly interested in genes EP300 and UGP2. Can these two be significant targets?
Interesting that in this meta-analysis, there actually is a significant gene set enrichment in MAGMA: GOCC_GLUTAMATERGIC_SYNAPSE

Paolo posted the following comment about overlapping controls on the ME/CFS Science Blog, so I'm not sure how much impact this issue may have had on the results:
There is a problem though: DecodeME and UK Biobank have overlapping controls. Now I am trying to solve this problem using a correction of the weights used in the meta analysis.

I don't know enough about the method used for the meta-analysis calculation to really comment on it. But assuming it's valid, there are still 27 resulting candidate genes at the chr22 locus which includes EP300. So that could be a gene of interest, or maybe it's another of those many gene options. And there are four candidate genes at the chr2 locus that includes UGP2.
 


Thanks @forestglip, yes, I wonder if a key mechanism here is excitotoxicity via glutamate (and perhaps quinolinic acid). Another potentially interesting finding is the gene ACO2 (also in the list of genes), which could be linked to the itaconate shunt hypothesis. If we do indeed have issues with glutamatergic synapses coupled with ER stress and impaired ER autophagy (ER-phagy), then we may have a perfect storm taking place.
 
Here's an attempt to bring together all of the main candidate genes from different sources: Tier 1 genes, tier 2 genes, genes significant in MAGMA, and the gene (or two genes if it is not clear) closest to a locus for the top 25 loci.

Gene | Method | CHR | Lead variant position (GRCh38) | Reference allele | Effect allele | Lead variant ID | Lead variant p-value
LRRC7 | MAGMA | 1 | 69696474 | A | G | 1:69696474:A:G | 2.06E-07
xxxxx
NEGR1 | Nearest (1 of 2) | 1 | 73126414 | C | CA | 1:73126414:C:CA | 1.19E-07
LRRIQ3 | Nearest (1 of 2) | 1 | 73126414 | C | CA | 1:73126414:C:CA | 1.19E-07
xxxxx
ZNF644 | Nearest | 1 | 91028158 | C | T | 1:91028158:C:T | 1.89E-07
xxxxx
RABGAP1L | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
DARS2 | Tier 1, MAGMA, Nearest | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
RC3H1 | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
GPR52 | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
ZBTB37 | Tier 1, MAGMA | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
TNFSF4 | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
ANKRD45 | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
KLHL20 | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
PRDX6 | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
SERPINC1 | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
SLC9C2 | Tier 1 | 1 | 173846152 | T | C | 1:173846152:T:C | 2.56E-08
xxxxx
CACNA1E | Nearest | 1 | 181676091 | G | A | 1:181676091:G:A | 8.85E-07
xxxxx
VRK2 | Nearest | 2 | 57808420 | G | A | 2:57808420:G:A | 9.49E-07
xxxxx
PLCL1 | Nearest | 2 | 197882813 | A | G | 2:197882813:A:G | 6.64E-07
xxxxx
HTT | Nearest | 4 | 3240118 | C | T | 4:3240118:C:T | 8.03E-07
xxxxx
ECI2 | Nearest | 6 | 4336259 | T | C | 6:4336259:T:C | 2.90E-07
xxxxx
BTN2A2 | Tier 1 | 6 | 26239176 | A | G | 6:26239176:A:G | 4.11E-09
TRIM38 | Tier 1 | 6 | 26239176 | A | G | 6:26239176:A:G | 4.11E-09
ZNF322 | Tier 1 | 6 | 26239176 | A | G | 6:26239176:A:G | 4.11E-09
ABT1 | Tier 1 | 6 | 26239176 | A | G | 6:26239176:A:G | 4.11E-09
HFE | Tier 1 | 6 | 26239176 | A | G | 6:26239176:A:G | 4.11E-09
BTN3A3 | Tier 1 | 6 | 26239176 | A | G | 6:26239176:A:G | 4.11E-09
HMGN4 | Tier 1 | 6 | 26239176 | A | G | 6:26239176:A:G | 4.11E-09
H4C8 | MAGMA, Nearest | 6 | 26239176 | A | G | 6:26239176:A:G | 4.11E-09
xxxxx
ZNF311 | MAGMA | 6 | 29016371 | G | A | 6:29016371:G:A | 2.25E-06
xxxxx
FBXL4 | Tier 2 | 6 | 97984426 | C | CA | 6:97984426:C:CA | 4.85E-08
POU3F2 | Nearest (1 of 2) | 6 | 97984426 | C | CA | 6:97984426:C:CA | 4.85E-08
MMS22L | Nearest (1 of 2) | 6 | 97984426 | C | CA | 6:97984426:C:CA | 4.85E-08
xxxxx
MLLT10 | Nearest (1 of 2) | 10 | 21748880 | A | G | 10:21748880:A:G | 6.34E-07
DNAJC1 | Nearest (1 of 2) | 10 | 21748880 | A | G | 10:21748880:A:G | 6.34E-07
xxxxx
SOX6 | Nearest | 11 | 16217844 | C | G | 11:16217844:C:G | 1.08E-07
xxxxx
SLC2A14 | Nearest | 12 | 7860921 | T | A | 12:7860921:T:A | 5.79E-07
xxxxx
SUDS3 | Tier 1, MAGMA | 12 | 118202773 | CTTTTTTTTTTTTT | C | 12:118202773:CTTTTTTTTTTTTT:C | 1.64E-07
PEBP1 | Tier 1 | 12 | 118202773 | CTTTTTTTTTTTTT | C | 12:118202773:CTTTTTTTTTTTTT:C | 1.64E-07
VSIG10 | Tier 1 | 12 | 118202773 | CTTTTTTTTTTTTT | C | 12:118202773:CTTTTTTTTTTTTT:C | 1.64E-07
TAOK3 | Nearest, MAGMA | 12 | 118202773 | CTTTTTTTTTTTTT | C | 12:118202773:CTTTTTTTTTTTTT:C | 1.64E-07
xxxxx
DNAH10 | MAGMA | 12 | 123924955 | G | A | 12:123924955:G:A | 2.43E-07
ZNF664 | MAGMA | 12 | 123924955 | G | A | 12:123924955:G:A | 2.43E-07
CCDC92 | MAGMA | 12 | 123924955 | G | A | 12:123924955:G:A | 2.43E-07
xxxxx
OLFM4 | Nearest, Tier 2 | 13 | 53194927 | GT | G | 13:53194927:GT:G | 1.16E-07
xxxxx
PCDH17 | Nearest | 13 | 58456743 | T | C | 13:58456743:T:C | 9.42E-07
xxxxx
CCPG1 | Tier 2 | 15 | 54866724 | A | G | 15:54866724:A:G | 7.62E-09
UNC13C | Nearest | 15 | 54866724 | A | G | 15:54866724:A:G | 7.62E-09
xxxxx
SHISA6 | Nearest | 17 | 11325637 | G | C | 17:11325637:G:C | 8.26E-08
xxxxx
CA10 | Tier 1, Nearest | 17 | 52183006 | C | T | 17:52183006:C:T | 2.11E-09
xxxxx
DCC | Nearest | 18 | 53232948 | C | T | 18:53232948:C:T | 2.48E-07
xxxxx
CDK5RAP1 | Nearest | 20 | 33363039 | G | A | 20:33363039:G:A | 5.41E-07
xxxxx
CSE1L | Tier 1, MAGMA | 20 | 48914387 | T | TA | 20:48914387:T:TA | 9.51E-12
ARFGEF2 | Tier 1, MAGMA, Nearest | 20 | 48914387 | T | TA | 20:48914387:T:TA | 9.51E-12
DDX27 | Tier 1 | 20 | 48914387 | T | TA | 20:48914387:T:TA | 9.51E-12
STAU1 | Tier 1, MAGMA | 20 | 48914387 | T | TA | 20:48914387:T:TA | 9.51E-12
ZNFX1 | Tier 1 | 20 | 48914387 | T | TA | 20:48914387:T:TA | 9.51E-12
B4GALT5 | Tier 1 | 20 | 48914387 | T | TA | 20:48914387:T:TA | 9.51E-12
PTGIS | Tier 1 | 20 | 48914387 | T | TA | 20:48914387:T:TA | 9.51E-12
xxxxx
MRPL39 | Nearest | 21 | 25487862 | A | AG | 21:25487862:A:AG | 5.10E-07

LRRC7
NEGR1
LRRIQ3
ZNF644
RABGAP1L
DARS2
RC3H1
GPR52
ZBTB37
TNFSF4
ANKRD45
KLHL20
PRDX6
SERPINC1
SLC9C2
CACNA1E
VRK2
PLCL1
HTT
ECI2
BTN2A2
TRIM38
ZNF322
ABT1
HFE
BTN3A3
HMGN4
H4C8
ZNF311
FBXL4
POU3F2
MMS22L
MLLT10
DNAJC1
SOX6
SLC2A14
SUDS3
PEBP1
VSIG10
TAOK3
DNAH10
ZNF664
CCDC92
OLFM4
PCDH17
CCPG1
UNC13C
SHISA6
CA10
DCC
CDK5RAP1
CSE1L
ARFGEF2
DDX27
STAU1
ZNFX1
B4GALT5
PTGIS
MRPL39

Sources:
Tier 1 and 2 genes: Candidate genes document
MAGMA genes: Supplementary table 4
Top 25 loci: Supplementary table 3, with nearest gene to each lead variant found with LocusZoom. (Can also be done on UCSC Genome Browser.)

1. Go to the genome browser at https://genome.ucsc.edu/cgi-bin/hgTracks

2. Press the "Hide All" button to turn off the unneeded tracks.

3. In the dropdown for GENCODE, select "pack", then press "Refresh".

4. In the search box at the top, enter a chromosomal position in the format CHROMOSOME:POSITION, then press "Search". For example, 20:48914387 is the location of the most significant variant in DecodeME.

5. Press the buttons to the right of "Zoom out" at the top of the page until protein-coding genes, which are colored blue, show up in the main field of view. For example, I had to press the 10x zoom out button five times to see that ARFGEF2 is the nearest gene to the most significant variant.
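
The "nearest gene" lookup in the steps above can also be sketched in code. This is only a minimal illustration, assuming you already have start and end coordinates for the genes on the same chromosome as the variant. The `Gene` type, the function names and the toy coordinates are all hypothetical and are not real annotations.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # first base of the gene (same chromosome as the variant)
    end: int    # last base of the gene


def distance_to_gene(position: int, gene: Gene) -> int:
    """Return 0 if the position lies inside the gene, else the gap in bases."""
    if position < gene.start:
        return gene.start - position
    if position > gene.end:
        return position - gene.end
    return 0


def nearest_gene(position: int, genes: list[Gene]) -> Gene:
    """Pick the gene with the smallest distance to the position."""
    return min(genes, key=lambda g: distance_to_gene(position, g))


# Hypothetical coordinates, for illustration only (not real annotations).
toy_genes = [
    Gene("GENE_A", 1_000, 5_000),
    Gene("GENE_B", 20_000, 30_000),
    Gene("GENE_C", 90_000, 95_000),
]

print(nearest_gene(18_500, toy_genes).name)
```

When two genes are about equally close, a rule like this picks one arbitrarily, which is why some loci in the table list two genes as "Nearest (1 of 2)".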
 
I mainly compiled the list of candidate genes above to check if any of those genes are ranked highly for rare variant associations in the Genebass browser, which includes data from a study of all the phenotypes in the UK Biobank, including ME/CFS.

The rare variant study/dataset is discussed more here: Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes, 2022, Karczewski et al

So, checking all the different p-values in Genebass (SKATO p, SKAT p, and burden p for missense, pLoF, and synonymous variant associations, making 9 p-values per gene) for all 59 of the above genes, these are all the results where the p-value was below 0.05 in any of the tests:
Gene | P-value | Variant type | Statistical test
VSIG10 | 2.53E-05 | synonymous | Burden
VSIG10 | 3.38E-05 | synonymous | SKATO
ANKRD45 | 0.00099 | synonymous | SKAT
MMS22L | 0.00109 | pLoF | SKAT
MMS22L | 0.00118 | pLoF | SKATO
ANKRD45 | 0.00162 | synonymous | SKATO
MMS22L | 0.00450 | pLoF | Burden
ANKRD45 | 0.00585 | pLoF | SKAT
UNC13C | 0.00914 | synonymous | Burden
ANKRD45 | 0.01026 | pLoF | SKATO
DCC | 0.01077 | pLoF | SKAT
SLC2A14 | 0.01204 | missense|LC | Burden
DCC | 0.01345 | pLoF | SKATO
UNC13C | 0.01723 | synonymous | SKATO
SLC2A14 | 0.01848 | missense|LC | SKATO
TAOK3 | 0.02115 | pLoF | SKAT
CDK5RAP1 | 0.02116 | synonymous | SKAT
CDK5RAP1 | 0.02325 | synonymous | SKATO
PLCL1 | 0.02359 | synonymous | SKAT
ANKRD45 | 0.02551 | synonymous | Burden
VSIG10 | 0.02558 | synonymous | SKAT
RABGAP1L | 0.02674 | pLoF | SKAT
ZBTB37 | 0.02692 | pLoF | SKAT
TNFSF4 | 0.02708 | missense|LC | SKAT
DCC | 0.02809 | pLoF | Burden
CDK5RAP1 | 0.02935 | synonymous | Burden
PLCL1 | 0.03497 | synonymous | SKATO
ZNF644 | 0.03584 | synonymous | Burden
TAOK3 | 0.03714 | pLoF | SKATO
SLC2A14 | 0.03919 | missense|LC | SKAT
ZNF644 | 0.03954 | synonymous | SKAT
VSIG10 | 0.03994 | pLoF | SKAT
ZBTB37 | 0.04032 | pLoF | SKATO
ZNF644 | 0.04152 | synonymous | SKATO
TNFSF4 | 0.04163 | synonymous | SKAT
SUDS3 | 0.04191 | synonymous | SKAT
HTT | 0.04427 | pLoF | SKAT
RABGAP1L | 0.04841 | pLoF | SKATO

So just maybe there's a sign of a real signal for VSIG10, but otherwise the p-values aren't really low enough to give much confidence in these genes, considering how many different gene-variant type associations are being tested (59 genes x 3 variant types = 177 tests, or three times as many, 531, if different statistical test types on the same gene, e.g. SKAT vs SKATO, are counted as different tests).
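
As a rough check on the multiple-testing point above, here is a minimal sketch that compares the top p-values from the table against a Bonferroni threshold. Bonferroni is an assumption chosen for illustration and was not named in the post. It is also likely conservative here, because SKAT, SKATO and burden tests on the same gene are correlated.

```python
ALPHA = 0.05
N_GENES = 59
N_VARIANT_TYPES = 3  # missense|LC, pLoF, synonymous
N_TEST_TYPES = 3     # SKATO, SKAT, burden


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


tests_by_variant_type = N_GENES * N_VARIANT_TYPES      # 177
tests_all = tests_by_variant_type * N_TEST_TYPES       # 531

# Smallest p-values copied from the table above.
top_hits = {
    "VSIG10 synonymous Burden": 2.53e-05,
    "VSIG10 synonymous SKATO": 3.38e-05,
    "ANKRD45 synonymous SKAT": 0.00099,
    "MMS22L pLoF SKAT": 0.00109,
}

for n_tests in (tests_by_variant_type, tests_all):
    threshold = bonferroni_threshold(ALPHA, n_tests)
    passing = [name for name, p in top_hits.items() if p < threshold]
    print(f"{n_tests} tests: threshold {threshold:.2e}, passing: {passing}")
```

Under either count, only the two VSIG10 synonymous results fall below the corrected threshold, which matches the cautious reading above.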
 
Re-reading the preprint paper, this non-scientist is struggling with how imputation was done (pooled cases+controls, versus separately ... if cases have unusual haplotypes surely separate imputation avoids forcing imputation from, essentially, population haplotypes ...).

Is it possible to tell which variants/loci were imputed - can't immediately find it in the paper/supplementary? I am sort-of thinking an imputed variant/locus is a less-reliable signal than a chip-detected SNP.
 
@forestglip, would it be worth making a locked topic with your list of main candidate genes, which only gets added to if other candidates appear?

I've once or twice found myself reading things that might be related, then having to spend ages trying to remember whereabouts in a very long thread the list of genes last appeared. It would make it easier to find for anyone who's got time and energy to do a bit of digging.

Calling it something simple like 'Main candidate genes list 2025' would make it more searchable too. I always struggle to relocate topics about papers with long titles, as it's hard to remember one word that was definitely in it.
 
Re-reading the preprint paper, this non-scientist is struggling with how imputation was done (pooled cases+controls, versus separately ... if cases have unusual haplotypes surely separate imputation avoids forcing imputation from, essentially, population haplotypes ...).
My shallow understanding is that imputation is done per person in a study, not pooling any participants together, by comparing that person's DNA to a whole genome reference panel, like 1000 Genomes. They look at the pattern of SNPs that they were actually able to test in a person, and see how that pattern compares to the whole genomes in the reference panel to see if a given pattern is associated with high confidence with other untested SNPs.

I wouldn't think it would cause issues for the reference panel to be healthy unlike the cases being imputed, but I'm not sure.
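
To make the pattern-matching idea above a bit more concrete, here is a deliberately oversimplified toy. It assumes a tiny made-up reference panel and exact matching on the typed sites. Real imputation software uses statistical models over phased haplotypes rather than exact matching, so treat this only as an illustration of the intuition. All names and data are hypothetical.

```python
from collections import Counter

# Hypothetical reference panel: each string is one haplotype across 5 SNP sites.
REFERENCE_PANEL = [
    "ACGTA",
    "ACGTA",
    "ACGTC",
    "TCGAA",
    "TTGAA",
]


def impute_site(observed: dict[int, str], site: int, panel: list[str]):
    """Guess the allele at an untyped site from reference haplotypes that
    agree with every typed site. Returns (allele, share of matching
    haplotypes carrying it), or None if no haplotype matches."""
    matches = [h for h in panel if all(h[i] == a for i, a in observed.items())]
    if not matches:
        return None
    allele, count = Counter(h[site] for h in matches).most_common(1)[0]
    return allele, count / len(matches)


# The person was only genotyped at sites 0 and 3.
typed = {0: "A", 3: "T"}
print(impute_site(typed, 1, REFERENCE_PANEL))  # every matching haplotype agrees
print(impute_site(typed, 4, REFERENCE_PANEL))  # matching haplotypes disagree
```

The share returned alongside the allele plays a role loosely similar to an imputation confidence score: it is high when the typed pattern pins down the untyped site and lower when the matching reference haplotypes disagree.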

Whether SNPs were imputed in DecodeME is in the summary statistics files, where 1 means the SNP was actually measured, and anything less is a score for confidence about the imputation. For example:
Of the 8 hits, only 1 was measured, but the others have a high INFO_SCORE, suggesting that their distributions follow Hardy-Weinberg Equilibrium.


I suppose it's based on the strong correlations between SNPs, so that you only need to know a few to be pretty certain what the others are. But I was quite surprised that the proportion of actually measured SNPs is so low (around 5% of the total, apparently).
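
For anyone wanting to check the measured versus imputed split themselves, here is a minimal sketch. It assumes a tab-delimited summary statistics file with an `INFO_SCORE` column where exactly 1 means measured, as described above. The file layout, the `ID` column and the toy rows are assumptions for illustration, not the real DecodeME data.

```python
import csv
import io

# Made-up rows for illustration; only the INFO_SCORE column name comes from the thread.
TOY_SUMMARY_STATS = "ID\tINFO_SCORE\nvariant_1\t1\nvariant_2\t0.98\nvariant_3\t0.91\nvariant_4\t1\n"


def count_measured(handle, info_column="INFO_SCORE"):
    """Count rows with a score of exactly 1 (treated as measured) versus
    rows with any other score (treated as imputed)."""
    reader = csv.DictReader(handle, delimiter="\t")
    measured = imputed = 0
    for row in reader:
        if float(row[info_column]) == 1.0:
            measured += 1
        else:
            imputed += 1
    return measured, imputed


measured, imputed = count_measured(io.StringIO(TOY_SUMMARY_STATS))
print(f"measured: {measured}, imputed: {imputed}")
```

With a real file you would pass an open file handle instead of the `io.StringIO` wrapper. One caveat: this treats any score of exactly 1 as measured, so an imputed variant whose score was rounded up to 1 would be miscounted.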
 
Calling it something simple like 'Main candidate genes list 2025' would make it more searchable too. I always struggle to relocate topics about papers with long titles, as it's hard to remember one word that was definitely in it.
You mean specifically genes that the DecodeME data suggests might be interesting or based on any source? Including any ME/CFS studies would be a lot more of a challenge depending on what sources are considered.

It might be useful. Some other options are linking to that gene list post from the first post of this thread, or bookmarking that post with the forum bookmark tool. What do you think? I'll also bring it up with the other mods.
 
Whether SNPs were imputed in DecodeME is in the summary statistics files, where 1 means the SNP was actually measured,

Brilliant - I never would have found that.

I wouldn't think it would cause issues for the reference panel to be healthy unlike the cases being imputed, but I'm not sure.
For example, if ALL pathological ME variants are absent on the UKB Axiom array, how sure can we be that imputation against the general population will cause them to be associated with pwME?
 
For example, if ALL pathological ME variants are absent on the UKB Axiom array, how sure can we be that imputation against the general population will cause them to be associated with pwME?
I don't think I understand imputation well enough to answer. In any case, we know there is a lot of the genome data missing in DecodeME. Imputation can only get you so far, and some of the imputed variants might be wrong.

A whole genome sequencing study, AKA SequenceME, would be very valuable so that we could actually measure all the positions instead of making educated guesses about all the non-measured positions.
 
My impression is that 'imputation' is used to cover more than one inferential process, at least one involving individual SNP profiles (imputing intervening SNPs) and another making use of population weightings of SNP forms (imputing risk genes from minihaplotypes of SNPs).
 
On TNFSF4 (also known as OX40L):

A paper on lupus (SLE) quoted by forestglip said that up-regulation of TNFSF4/OX40L (the OX40 ligand) predisposes to lupus. I think TNFSF4 activates CD4+ T cells:
We hypothesize that increased expression of TNFSF4 predisposes to SLE either by quantitatively augmenting T cell–APC interaction or by influencing the functional consequences of T cell activation via TNFRSF4.
Two tumor necrosis factor (TNF) superfamily members located within intervals showing genetic linkage with SLE are TNFSF4 (also known as OX40L; 1q25), which is expressed on activated antigen-presenting cells (APCs) (refs. 7, 8) and vascular endothelial cells (ref. 9), and also its unique receptor, TNFRSF4 (also known as OX40; 1p36), which is primarily expressed on activated CD4+ T cells (ref. 10).

TNFSF4 produces a potent co-stimulatory signal for activated CD4+ T cells after engagement of TNFRSF4 (ref. 11).

Using both a family-based and a case-control study design, we show that the upstream region of TNFSF4 contains a single risk haplotype for SLE, which is correlated with increased expression of both cell-surface TNFSF4 and the TNFSF4 transcript.
Another quoted paper suggests that blocking the ligand and receptor can improve lupus:
Blockade of OX40/OX40L signaling using anti-OX40L alleviates murine lupus nephritis (2024)



BUT DecodeME seems to be suggesting the opposite for ME/CFS - a decreased expression of TNFSF4, at least in some defined tissues
The DecodeME paper says that the ME/CFS variants here are associated with decreased expression of TNFSF4 in the lung, skin of sun exposed lower leg, and thyroid.


Members have noted that some drugs that reduce expression of TNFSF4 are coming to market, and there was a suggestion that they could be tried in ME/CFS.
A few OX40L monoclonals are coming to market soon:
This particular monoclonal is made by Sanofi, who recently agreed to let Scheibenbogen trial their CD38 inhibitor in ME/CFS. So if there is a rationale for trying their OX40 monoclonal in ME they might well be amenable.

If I have understood things correctly (and I may well not have, in which case, let me know), knocking down expression of TNFSF4 in people with ME/CFS would be very unlikely to help. The genetic variant found by DecodeME that reduces TNFSF4 expression might only be relevant to increasing the risk of developing ME/CFS, so an OX40L monoclonal that reduces TNFSF4 expression might not make symptoms worse, but there doesn't seem to be a good rationale to try it.
?
 
The genetic variant found by DecodeME that reduces TNFSF4 expression might only be relevant to increasing the risk of developing ME/CFS, so it might not make symptoms worse, but there doesn't seem to be a good rationale to try it.
I think there are often situations where the same variant can be associated with increases or decreases of gene expression depending on the tissue, or maybe depending on other factors like age.

For example, a variant from a past study:
So the only significant finding in terms of genetic mutations was a SNP which affects the expression of CCK (aka cholecystokinin). The database they used seems to say that this SNP decreases expression of CCK in cultured fibroblasts but increases expression in colon cells.

Apart from that, the gene expression database doesn't cover every tissue that might be affected, both because statistical power might be too low and because not every possible cell type or tissue type is included in the database.

So just noting that it's possible this variant might lead to increased expression where it matters for ME/CFS. Or of course you might be right and it could be a dead end.
 
If I have understood things correctly, (and I may well not have, in which case, let me know), knocking down expression of TNFSF4 in people with ME/CFS would be very unlikely to help. The genetic variant found by DecodeME that reduces TNFSF4 expression might only be relevant to increasing the risk of developing ME/CFS, so it might not make symptoms worse, but there doesn't seem to be a good rationale to try it.
?
So I may be misremembering, but the Sanofi OX40L monoclonal was claimed by the company to have a T cell calming/soothing effect, and JE and others have theorised that T cells may be central to the mechanism of ME/CFS. So I was going from that and the connections I saw, iirc.

But I didn't realise it was down, not up, in DecodeME.
 
I don't think I understand imputation well enough to answer.

I have spent an hour or two grappling with phasing and imputation, but realize the task is very large and complex. It just bothers me that most of the SNPs in the preprint are inferred by fiendish arithmetic.

As you say, SequenceME will be a blessing. Is there currently ANY whole genome data for ME? Is SequenceME the only WGS project on the horizon?
 