Preprint Cluster analysis of ME/CFS symptoms in DecodeME reveals two subgroups and a link to onset type, 2026, St-Jean et al

For me the absence of a different genetic signature is the key thing. I am also sceptical about the significance or reliability of the history of infectious onset but I think it is very interesting that BTN2A1, in particular, is not differentially linked either to severity group or infectious history.

I agree with others that if anything this again suggests that ME/CFS is rather homogeneous mechanistically. As for the lack of difference in gene variants between the sexes, it seems to point to everyone having the same route to disease. And yes, it suggests that the gene variants are telling us about critical points in regulatory failure rather than just downstream tissue sensitivity. If genes were linked to how susceptible a tissue like brain is to bombardment with signals then they should link even more to severe cases.

All in all I think it tends to be reassuring - that there is probably only one major route to ME/CFS to find and the gene variants are pointing at critical pathway points.
 
I think severity groups are meaningful. I would not call them subtypes - I don't think that makes a lot of sense either clinically or scientifically. But severity levels are relevant to unpicking mechanism, especially if the population is bimodal in this respect (if that can be reliably said).
 
For me the absence of a different genetic signature is the key thing.
Isn’t that too soon to say, though? The current analysis might be underpowered, it was 8k vs 10k instead of 15k vs 250k or whatever DecodeME was.

The paper also talks about three suggestive differences in genes:
In contrast, the GWAS conducted as part of the present analysis did not identify any genome-wide significant subtype-associated variants, and the three suggestive associations should therefore be interpreted cautiously, particularly as approximately one would be expected to be false positive at the threshold used.
Nevertheless, the proximity of these suggestive loci to genes implicated in circadian regulation, inflammatory signalling, and sensory or pain processing is notable in relation to the clinical profile of the HSBC, which was marked by greater illness severity, sleep-related symptoms, sensory sensitivities, pain-associated comorbidities, and higher prevalence of infectious or uncertain onset.
Taken together, the genetic findings do not validate the symptom clusters directly, but they provide a plausible biological frame for interpreting them and suggest that symptom burden, onset type, sex, and comorbidity profile may be important stratification variables in future genetic and mechanistic studies of ME/CFS.
 
The current analysis might be underpowered, it was 8k vs 10k instead of 15k vs 250k or whatever DecodeME was.
The methods say the GWAS cohorts were smaller:
Genome-Wide Association Study
Genetic data from DecodeME participants was extracted from the jointly imputed case-control dataset for GWAS-1 (4). For this analysis, we retained only participants who were included in DecodeME GWAS-1 and had matching questionnaire data, leading to a dataset with 15,328 participants. Of these, 2,401 participants were allocated to the LSBC, and 3,264 to the HSBC.
Though I don't understand what those smaller numbers are based on, and why not all 15,328 would be included.
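To put the quoted numbers side by side, here is a rough sketch of the usual effective sample size formula for a two-group comparison, 4 / (1/n1 + 1/n2). The subgroup counts are the ones quoted from the methods; the "15k vs 250k" figures are only the loose recollection from earlier in the thread, not exact DecodeME numbers, and the function name is just made up for illustration.

```python
def effective_sample_size(n_a: int, n_b: int) -> float:
    """Effective sample size for a two-group comparison: 4 / (1/n_a + 1/n_b)."""
    return 4.0 / (1.0 / n_a + 1.0 / n_b)


# Subgroup GWAS counts as quoted from the methods section.
n_lsbc, n_hsbc = 2401, 3264
subgroup_neff = effective_sample_size(n_lsbc, n_hsbc)

# ASSUMPTION: approximate main GWAS sizes, taken from the loose
# "15k vs 250k or whatever" recollection in the thread, not exact figures.
approx_cases, approx_controls = 15_000, 250_000
main_neff = effective_sample_size(approx_cases, approx_controls)

print(f"Participants in the subgroup GWAS: {n_lsbc + n_hsbc}")
print(f"Subgroup effective sample size: {subgroup_neff:.0f}")
print(f"Approximate main GWAS effective sample size: {main_neff:.0f}")
print(f"Ratio (main / subgroup): {main_neff / subgroup_neff:.1f}")
```

On these rough inputs the subgroup comparison has something like a tenth of the effective sample size of the main case-control analysis, which is one way of seeing why it could be underpowered.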
 
That goes beyond the evidence due to the limited sample sizes when using subgroups. We know from other genetic studies that reaching a critical mass is crucial to be able to find the relevant genes. We need larger cohorts to say anything more definitively.
They have a cohort of 10,000 patients. Leading immunologists like Selin and Kumar who work with a research question (which is key if you want to find out something) make proof of concept studies with 6 patients and find out extremely interesting and valuable things.

Blaming limited severity subgroup sizes at this scale sounds like a particle physicist arguing that the only reason their field hasn't made a major breakthrough since the 1970s is because nobody built them a bigger Large Hadron Collider.
 
They have a cohort of 10,000 patients. Leading immunologists like Selin and Kumar who work with a research question (which is key if you want to find out something) make proof of concept studies with 6 patients and find out extremely interesting and valuable things.
In a GWAS, they're doing multiple test correction for around a million tests (about that many independent loci on the genome). If Selin and Kumar ran a million tests with 6 individuals and corrected the p-values, there would almost certainly not be any significant findings.

It's the reason GWAS sample sizes are usually at least in the tens of thousands, and sometimes in the millions.
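A back-of-envelope sketch of that arithmetic, in case it helps. It assumes a Bonferroni correction at alpha = 0.05 over the "around a million" independent tests mentioned above, and, purely for illustration, a 6-person study split 3 vs 3 and analysed with an exact permutation test. Neither the split nor the test is taken from any actual study.

```python
import math

ALPHA = 0.05
N_INDEPENDENT_TESTS = 1_000_000  # "around a million", as stated above

# Bonferroni-corrected per-test threshold, the familiar 5e-8 GWAS cutoff.
bonferroni_threshold = ALPHA / N_INDEPENDENT_TESTS


def expected_false_positives(p_threshold: float,
                             n_tests: int = N_INDEPENDENT_TESTS) -> float:
    """Expected number of null tests passing p_threshold by chance alone."""
    return p_threshold * n_tests


# ASSUMPTION: a hypothetical 6-person study split 3 vs 3. With an exact
# permutation test there are only C(6, 3) distinct group labelings, so the
# smallest attainable one-sided p-value is 1 / C(6, 3).
n_total, n_group = 6, 3
min_p_six_people = 1 / math.comb(n_total, n_group)

print(f"Bonferroni threshold: {bonferroni_threshold:.0e}")
print(f"Smallest possible p-value with 3 vs 3: {min_p_six_people}")
print(f"Can 6 people reach genome-wide significance? "
      f"{min_p_six_people < bonferroni_threshold}")
print(f"Expected chance hits at p < 0.05 across all tests: "
      f"{expected_false_positives(0.05):.0f}")
```

So under these assumptions a 6-person design cannot reach the corrected threshold whatever the data look like, whereas an uncorrected p < 0.05 cutoff would be expected to throw up tens of thousands of chance hits.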
 
On a very brief skim through (all I can manage these days) this seems like useful work; it's good to see further analyses of the DecodeME questionnaire data. Symptomatic burden and self-reported severity do look strongly associated but not identical, with higher severity overall being more likely to fall into the higher symptom burden cluster; but it's not one-to-one - the illness course also seems to track well, with the higher burden cluster enriched for deterioration and improvement being more common in the lower burden one.

Perhaps a median or mean symptom count by severity category analysis would be useful, e.g. some histograms & density plots of symptom count by cluster and severity, or a symptom-count adjusted enrichment analysis to see if any subtype remains after removing the burden dimension? The adjusted regression gives higher odds for infectious vs. non, but the cluster table (Suppl. Tab. 2) has an excess of unknown-onset and lower non-infectious onset in the HSBC.

The cluster separation does seem modest, more of a broad symptom-burden gradient. I think the authors' interpretation that these are "clinically meaningful subtypes, potentially reflecting distinct biological mechanisms" is too strong; if such an analysis had produced one cluster that was, say, enriched for pain and another cluster for orthostatic symptoms, that would be a different matter.
 
"Nevertheless, the proximity of these suggestive loci to genes implicated in circadian regulation, inflammatory signalling, and sensory or pain processing is notable in relation to the clinical profile of the HSBC, which was marked by greater illness severity, sleep-related symptoms, sensory sensitivities, pain-associated comorbidities"

These are further indications of the mechanism. I don’t think this means that the disease does not fluctuate and that there would be two very distinct groups, as variants would then have emerged as significant: either protective in the mild group, or aggravating in the other, or both.
 
First, they separated the participants into two groups based entirely on symptom severity. They found that those in the more severe group were about 1.24 times more likely to have had an infectious trigger at the start of their ME/CFS compared to those with lower symptom burden.
Symptomatic burden and self-reported severity do look strongly associated but not identical, with higher severity overall being more likely to fall into the higher symptom burden cluster; but it's not one-to-one
Perhaps a median or mean symptom count by severity category analysis would be useful, e.g. some histograms & density plots of symptom count by cluster and severity, or a symptom-count adjusted enrichment analysis to see if any subtype remains after removing the burden dimension?
I agree with the notes of caution about claiming subtypes; it doesn't sound as though much was found to support that idea.

I want to make sure that everyone understands that the symptom burden talked about here is the number of symptoms, as Nightsong explains. So, they didn't separate the groups based on symptom severity, they separated them on symptom number. The people reporting a higher number of symptoms also tended to report a higher illness severity but there was plenty of variation.

I'd also like to throw in the usual comments about the uncertainties related to self-selection of participants and self-reporting. It's conceivable that women have spent more time in support groups chatting with others and might therefore know about the co-morbidities that get talked about, things like POTS, and so be more likely to report them. Similarly, parents of young people with ME/CFS might tend to be across the ME/CFS literature and Facebook chat groups and assist their young person to tick more symptom boxes.

I haven't looked at the symptom descriptions, but it's possible that there may be more options relevant to women that more or less exclude men e.g. period pain, urinary tract infections, that could skew the symptom count to female.
 
Isn’t that too soon to say, though? The current analysis might be underpowered, it was 8k vs 10k instead of 15k vs 250k or whatever DecodeME was.

I have not had time to read through the paper but my thought was that it made the original 8 gene links look more robust in terms of critical pathways. There will no doubt be some other variants picked up with big enough numbers that reflect end organ susceptibilities. There might be some subgroup discriminating variants linked to process too but at least we are not seeing major subgroup separation.
 
In a GWAS, they're doing multiple test correction for around a million tests (about that many independent loci on the genome). If Selin and Kumar ran a million tests with 6 individuals and corrected the p-values, there would almost certainly not be any significant findings.

It's the reason GWAS sample sizes are usually at least in the tens of thousands, and sometimes in the millions.
In academic writing, the first rule I taught my students was basic research design: you must formulate a relevant question that you can actually answer within the limits of your time, scope, and resources.

What @Utsikt seems to be suggesting is that because these researchers failed at this foundational step – asking a question too large for their 10,000-patient cohort – we should reward the poor study design by giving them even more funding just so they can finally answer it.
 