ME/CFS Bioinformatics Repository

tralfamadorian97 · May 21, 2026

Since the DecodeME preprint was released, I have been teaching myself bioinformatics by analyzing the summary statistics from DecodeME and other GWAS. I’ve been publishing my code on GitHub here, and documenting my results here.

While I don’t have dramatic headline results, I still thought that this work would be of interest to Science4ME forum members, because there are a few analyses that supplement recent forum discussions. For example, I ran gene-level H-MAGMA on the DecodeME summary statistics.

If anyone wants to contribute, I will happily accept fixes or additions to the documentation or codebase. For minor changes, you can just create a GitHub pull request. For major additions, it is probably better to first create a GitHub issue with a brief proposal, which we can discuss.
@forestglip already found the repo and has made some very helpful contributions.

forestglip · May 21, 2026

It's been incredible watching this project being developed. I stumbled across it a few months ago, and it was immediately obvious tralfamadorian97 is remarkably motivated, organized, and intelligent.

There are a whole lot of different results and even lessons about bioinformatics tools in the documentation which I found very interesting to explore.

hotblack · May 21, 2026

Very impressive! Looks like a great resource, thanks for all your work @tralfamadorian97 and for sharing it.

ME/CFS Science Blog · May 21, 2026

Impressive @tralfamadorian97 , thanks.

Do check out Paolo Maccallini's meta-analysis which found some stronger results than DecodeME alone:

Preprint Thread 'Biological Insights from Genome-Wide Association Studies and Whole Genome Sequencing of [ME/CFS], 2026, Maccallini et al'

May 15, 2026

Biological Insights from Genome-Wide Association Studies and Whole Genome Sequencing of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome

Maccallini, Paolo

Abstract
Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a debilitating disorder of poorly understood etiology.

We performed a meta-analysis of two European-ancestry ME/CFS genome-wide association studies (GWAS) with no overlap in subjects, DecodeME and Million Veteran Program, comprising a total of 19,470 cases and 699,111 controls. Post-GWAS analyses investigated the association between ME/CFS and...

I'm particularly interested in the cell type eccentric medium spiny neuron that was significant in the meta-analysis:

Thread 'Eccentric medium spiny neuron (eMSN)'

May 18, 2026

Eccentric medium spiny neuron (eMSN), sometimes called eccentric spiny projection neurons (eSPNs), is the cell type that came up in Paolo Maccallini’s genetic analysis for ME/CFS. He used a meta-analysis of DecodeME and the Million Veteran Program (Neff = 74,219). Using a tool called MAGMA, Paolo could match the DNA results with gene expression data from the human brain atlas published by Siletti et al. 2023. More about the methodology can be discussed in the thread on Paolo’s paper...
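For reference, the Neff quoted above can be related to the case and control counts in the abstract. This is a minimal sketch, assuming the standard effective-sample-size formula for a case-control GWAS, Neff = 4 / (1/N_cases + 1/N_controls), computed per cohort and summed. The DecodeME split used below (15,579 cases, 259,909 controls) is an assumption taken from the DecodeME preprint rather than from this thread, and the Million Veteran Program counts are simply the remainder of the totals in the abstract.

```python
def effective_n(cases: int, controls: int) -> float:
    """Effective sample size of a single case-control cohort."""
    return 4.0 / (1.0 / cases + 1.0 / controls)


# Totals from the abstract quoted above.
total_cases, total_controls = 19_470, 699_111

# ASSUMPTION: per-cohort split, not stated in this thread.
decodeme = (15_579, 259_909)
mvp = (total_cases - decodeme[0], total_controls - decodeme[1])

# Treating everything as one big cohort vs. summing per-cohort values.
pooled = effective_n(total_cases, total_controls)
summed = effective_n(*decodeme) + effective_n(*mvp)

print(f"pooled Neff:            {pooled:,.0f}")
print(f"summed per-cohort Neff: {summed:,.0f}")
```

With these assumed counts the summed value comes out at about 74,219, while pooling all cases and controls into one cohort gives a slightly larger number (about 75,770).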

We've also been trying to use tools such as FLAMES to help identify the causal genes but haven't really managed to make it work. Perhaps you might be able to do it?

Thread 'Running FLAMES on DecodeME data'

Dec 11, 2025

Hi all,

Something that has been in the back of my head since the DecodeME data was released was running the newer FLAMES analysis:

GitHub - Marijn-Schipper/FLAMES: FLAMES: Accurate gene prioritization in GWAS loci

Somewhat discussed here: https://www.s4me.info/threads/initi...2025-decodeme-collaboration.45490/post-636145

My intention with this thread is that if enough of us get our heads together, I think we may be able to run this? I wanted to make a separate thread as I believe if we start working on this it...

tralfamadorian97 · May 21, 2026

ME/CFS Science Blog said:
Do check out Paolo Maccallini's meta-analysis which found some stronger results than DecodeME alone:

Yes, I did see Paolo's paper. Impressive. I'll read it in more detail when I get a chance.

ME/CFS Science Blog said:
We've also been trying to use tools such as FLAMES to help identify the causal genes but haven't really managed to make it work. Perhaps you might be able to do it?

I've created a GitHub issue to track this here. This might take a while, but at the moment I don't see any insurmountable barriers to running this.

forestglip · May 22, 2026

ME/CFS Science Blog said:
I'm particularly interested in the cell type eccentric medium spiny neuron that was significant in the meta-analysis:

See this page for the results of MAGMA using DecodeME sumstats on a brain cell-type dataset, like Paolo did. The same dataset as one of the two Paolo used actually: Siletti 2023. And the most significant finding was eccentric MSNs.

I don't know the details of the cell-type data, but I think it might be slightly different in this analysis because Paolo's analysis gives specific brain regions, while this seems to be more focused on cell-type in general. Maybe tralfamadorian can clarify. It looks more significant in this analysis.

ChronicallyOverIt · May 22, 2026

@tralfamadorian97 if I remember correctly I got the PoPS analysis running, I can put in a pull request for you to check that if you want.

The fine mapping is far more difficult than it seems. Using SUSIE-R is one thing, but if I remember correctly you need a large file, possibly many GB, to make this, called a linkage disequilibrium…It’s been a while, unfortunately I’ve been in a bit of a down trend since the beginning of the year so I just haven’t had the clarity of mind to learn something so complex.

ME/CFS Science Blog · May 22, 2026

forestglip said:
See this page for the results of MAGMA using DecodeME sumstats on a brain cell-type dataset, like Paolo did. The same dataset as one of the two Paolo used actually: Siletti 2023. And the most significant finding was eccentric MSNs.

Thanks, I see this as confirmation that the data really points to eMSN and that it wasn't a fluke or error from the meta-analysis Paolo did.

ME/CFS Science Blog · May 22, 2026

forestglip said:
See this page for the results of MAGMA using DecodeME sumstats on a brain cell-type dataset, like Paolo did.

Think it's worth looking into these other cell types as well. Seems like there's a link to splatter cells as well which are also poorly understood.

I also have a question @tralfamadorian97: does the signal for Amygdala excitatory (Cluster419) point to the amygdala's intercalated cells? I ask because I read that they have the same developmental origin as the eMSN, with some saying they "represent a ventral extension of the dorsal striatum."

tralfamadorian97 · May 22, 2026

ChronicallyOverIt said:
The fine mapping is far more difficult than it seems. Using SUSIE-R is one thing, but if I remember correctly you need a large file, possibly many GB, to make this, called a linkage disequilibrium…It’s been a while, unfortunately I’ve been in a bit of a down trend since the beginning of the year so I just haven’t had the clarity of mind to learn something so complex.

Yes, I was able to run SUSIE-R. Some example results are here. I did have to download the linkage disequilibrium (LD) matrices, which as you say are quite large. I got the LD matrices from here.

Overall, I found that SUSIE produced rather diffuse credible sets. That is, it returned about 50-100 possible causal variants, and could not narrow things down beyond this. I believe this is just a sample-size issue: I think you often need >50k cases to get narrow credible sets.

Nevertheless, these diffuse credible sets may still be useful. When I get around to it, my plan is to feed them into FLAMES.
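To illustrate what "diffuse" means here, below is a minimal sketch of how a 95% credible set follows from posterior probabilities: sort the variants by posterior probability and keep adding them until the running total reaches the coverage. The two probability vectors are synthetic, not DecodeME results. SUSIE builds one credible set per modelled effect and also applies a purity filter, which this sketch leaves out.

```python
import numpy as np


def credible_set(posterior: np.ndarray, coverage: float = 0.95) -> np.ndarray:
    """Indices of the smallest set of variants whose posterior mass reaches `coverage`."""
    order = np.argsort(posterior)[::-1]            # most probable variant first
    cumulative = np.cumsum(posterior[order])
    # First position where the running total is >= coverage.
    size = int(np.searchsorted(cumulative, coverage)) + 1
    return order[: min(size, len(order))]


# SYNTHETIC examples: one locus with a clear lead variant, one without.
sharp = np.array([0.90, 0.06, 0.02, 0.01, 0.01])
diffuse = np.full(70, 1 / 70)                      # 70 variants in tight LD, equally likely

print(len(credible_set(sharp)), "variants in the sharp credible set")
print(len(credible_set(diffuse)), "variants in the diffuse credible set")
```

When the posterior is spread almost evenly over many correlated variants, nearly all of them have to go into the set before 95% is reached, which is the situation described above.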

Hope the downward trend improves, @ChronicallyOverIt !

tralfamadorian97 · May 22, 2026

ME/CFS Science Blog said:
I also have a question @tralfamadorian97: does the signal for Amygdala excitatory (Cluster419) point to the amygdala's intercalated cells? I ask because I read that they have the same developmental origin as the eMSN, with some saying they "represent a ventral extension of the dorsal striatum."

Interesting question! Unfortunately, I don't know enough neuroscience to give a good answer. I do see that eccentric medium spiny neurons are labeled as originating mostly from the Amygdala. Their top three regions are: Amygdala: 75.9%, Cerebral cortex: 14.6%, Thalamus: 5.4%. It might be helpful to go back to the original Siletti paper and its supplementary material to understand more.

The HBA reference data I used for that MAGMA plot with the eccentric medium spiny neurons was prepared by the authors of this paper. They preprocessed the raw Siletti 2023 scRNAseq data to produce a matrix suitable for consumption by MAGMA, as described in their GitHub repo.

Yann04 · May 22, 2026

tralfamadorian97 said:
I do see that eccentric medium spiny neurons are labeled as originating mostly from the Amygdala. Their top three regions are: Amygdala: 75.9%, Cerebral cortex: 14.6%, Thalamus: 5.4%.

I thought they were from the striatum in the basal ganglia. I can’t remember where I read that but it seemed to be part of the definition.

forestglip · May 22, 2026

Yann04 said:
I thought they were from the striatum in the basal ganglia. I can’t remember where I read that but it seemed to be part of the definition.

The Human Protein Atlas at least reiterates high expression in amygdala, though I think they may be basing it on the same dataset used in the DecodeME analysis:

The Eccentric medium spiny neurons include cells detected in the forebrain regions. This cluster is especially prominent in amygdala, basal ganglia and hypothalamus. As shown in Table 1, 114 genes show elevated expression in Eccentric medium spiny neuron compared to other brain cell clusters. Neurons of this cluster are inhibitory neurons, expressing both GAD1 and GAD2, where GAD2 is classified as group enriched in this cluster along with other interneuronal containing cell clusters. The Dopamine receptor D1 (DRD1) show enriched expression in this cluster, and Tyrosine hydroxylase (TH) show elevated expression.

Yann04 · May 22, 2026

forestglip said:
The Human Protein Atlas at least reiterates high expression in amygdala, though I think they may be basing it on the same dataset used in the DecodeME analysis:

Ah thanks.

I had gotten the basal ganglia from here. It doesn’t seem to contradict the amygdala finding, but it does make it sound like the basal ganglia are their main location.

A type of central nervous neuron comprising more than 95% of the neurons in basal ganglia input structures, such as the caudate nucleus, putamen, nucleus accumbens and striatal districts in the olfactory tubercle. The cell body has a diameter in a range between 15 and 18 μm and gives rise to three to five primary dendrites that are aspiny proximally, but densely spiny beginning at about the first branch point.

Medium Spiny Neuron — Encyclopedia of Neuroscience (Springer)

I assumed eMSN are a subtype of MSN and would also follow this distribution, but maybe I’m wrong.

ChronicallyOverIt · May 22, 2026

tralfamadorian97 said:
Yes, I was able to run SUSIE-R. Some example results are here. I did have to download the linkage disequilibrium (LD) matrices, which as you say are quite large. I got the LD matrices from here.

Overall, I found that SUSIE produced rather diffuse credible sets. That is, it returned about 50-100 possible causal variants, and could not narrow things down beyond this. I believe this is just a sample-size issue: I think you often need >50k cases to get narrow credible sets.

Nevertheless, these diffuse credible sets may still be useful. When I get around to it, my plan is to feed them into FLAMES.

Hope the downward trend improves, @ChronicallyOverIt !

It’s been a while but I think UK Biobank has LD matrices, but I don’t think they are public and are also in the range of petabytes. That would probably increase the accuracy though…

tralfamadorian97 · May 22, 2026

ChronicallyOverIt said:
It’s been a while but I think UK Biobank has LD matrices, but I don’t think they are public and are also in the range of petabytes. That would probably increase the accuracy though…

The matrices I used were actually generated from the UK Biobank by the Broad Institute. While the underlying individual-level UKBB data is not public, the generated LD matrices are public. There are more than 2000 matrices, each one for a different region of the genome. In total, they are several terabytes. Luckily, I only needed to download the matrices for the specific regions of the genome I wanted to fine-map with SUSIE. This was a few gigabytes, and so was manageable.
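As a rough illustration of picking only the relevant regional matrices, here is a minimal sketch. It assumes the regions are fixed-size overlapping windows (3 Mb wide, starting every 1 Mb from position 1) and that files are named by chromosome, start and end. Both the window layout and the `chr{chrom}_{start}_{end}` naming are assumptions for illustration and should be checked against the actual LD release; the example position is hypothetical, and this is not necessarily how the repo does it.

```python
from typing import Tuple


def best_window(pos: int, window: int = 3_000_000, step: int = 1_000_000) -> Tuple[int, int]:
    """Start/end of the overlapping window whose centre lies closest to `pos`.

    ASSUMPTION: windows start at 1, 1 + step, 1 + 2 * step, ... and span `window` bases.
    """
    index = max(0, round((pos - 1 - window // 2) / step))
    start = 1 + index * step
    return start, start + window


def ld_file_stem(chrom: int, pos: int) -> str:
    """Hypothetical file stem for the LD matrix covering a lead variant."""
    start, end = best_window(pos)
    return f"chr{chrom}_{start}_{end}"


# HYPOTHETICAL lead variant on chromosome 20 at position 48,900,000.
print(ld_file_stem(20, 48_900_000))
```

Choosing the window centred nearest the lead variant keeps the whole fine-mapping interval well inside one matrix, so only one file per locus needs to be downloaded.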

tralfamadorian97 · May 22, 2026

ChronicallyOverIt said:
@tralfamadorian97 if I remember correctly I got the PoPS analysis running, I can put in a pull request for you to check that if you want.

Thanks for the offer! I'm trying to ensure that all the code in the repo runs through the task system as described here. The goal of the task system is to support reproducibility and iteration. The idea is that a user can just run a few lines of Python to reproduce any analysis. All auxiliary downloads are automatically performed.

I've sketched an outline of how I was planning to add PoPS via the task system here. If you are interested in doing any of those steps, let me know. However, I probably wouldn't recommend this as a first contribution, since it requires implementing new Task classes. Instead, if you are interested, I would suggest first getting a feel for how things work by using existing Task classes to analyze a new dataset.
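For readers unfamiliar with the general idea of a task system, here is a toy sketch of the pattern: each task declares the tasks it depends on, and asking for a result runs any missing prerequisites (including downloads) exactly once. The class, function and task names below are all hypothetical and are not the repo's actual API; see the repo documentation for the real Task classes.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Task:
    """One step of an analysis, plus the steps it depends on (hypothetical class)."""
    name: str
    action: Callable[[Dict[str, Any]], Any]   # receives {dependency name: result}
    deps: List["Task"] = field(default_factory=list)


def run(task: Task, cache: Optional[Dict[str, Any]] = None) -> Any:
    """Run `task` after its dependencies, executing each task at most once."""
    if cache is None:
        cache = {}
    if task.name not in cache:
        inputs = {dep.name: run(dep, cache) for dep in task.deps}
        cache[task.name] = task.action(inputs)
    return cache[task.name]


calls: List[str] = []                          # records what actually ran


def fetch_sumstats(_inputs):
    calls.append("fetch_sumstats")             # stands in for a real download
    return "sumstats.tsv"


def fetch_ld(_inputs):
    calls.append("fetch_ld")                   # stands in for a real download
    return "ld_region.npz"


def fine_map(inputs):
    calls.append("fine_map")
    return f"credible sets from {inputs['sumstats']} + {inputs['ld']}"


def summarise(inputs):
    calls.append("summarise")
    return f"report on {inputs['sumstats']}: {inputs['fine_map']}"


sumstats = Task("sumstats", fetch_sumstats)
ld = Task("ld", fetch_ld)
fine_mapping = Task("fine_map", fine_map, deps=[sumstats, ld])
report = Task("report", summarise, deps=[fine_mapping, sumstats])

result = run(report)                           # one call pulls in everything it needs
print(result)
print(calls)
```

Although `sumstats` is needed by two tasks, it is fetched only once, which is what makes re-running or extending an analysis cheap.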

ChronicallyOverIt · May 22, 2026

tralfamadorian97 said:
Luckily, I only needed to download the matrices for the specific regions of the genome I wanted to fine-map with SUSIE. This was a few gigabytes, and so was manageable.

nice! This is where I got lost, I didn’t know which files to use! Awesome!

Ravn · May 24, 2026

ME/CFS Science Blog said:
Think it's worth looking into these other cell types as well. Seems like there's a link to splatter cells as well which are also poorly understood.

Poorly understood indeed. On my first search attempt with AI it told me “The term "splatter neurons" does not appear in the provided search results or in established neuroscience literature, and may be a misinterpretation, misspelling, or fictional concept.”

A second search attempt fared a little better and brought up the following papers (which I’ve only very partially skimmed). The first paper may be the same as mentioned elsewhere in the context of eccentric medium spiny neurons

I’ve no idea if any of this is even remotely relevant. I just liked the name splatter neurons, possibly because unlike ‘eccentric medium spiny neurons’ I can actually remember ‘splatter’ without having to keep looking it up

1. Transcriptomic diversity of cell types across the adult human brain, 2022, Siletti et al [preprint link as the published version is paywalled]

2. Schizophrenia-associated changes in neuronal subpopulations in the human midbrain, 2024, Alsema et al
Given how little is known about splatter neurons the controls may be as interesting as the schizophrenia here

As the human midbrain is largely terra incognita regarding the transcriptional landscape at cellular resolution, we first describe the neuronal populations observed in the ventral midbrain. These were validated at the protein level with immunohistochemical stainings

3. A Gene-Expression Based Comparison of Murine and Human Inhibitory Interneurons in the Cerebellar Cortex and Nuclei, 2025, Schilling
Appears to further subgroup splatter neurons into 3 clusters

ChronicallyOverIt · May 24, 2026

ChronicallyOverIt said:
nice! This is where I got lost, I didn’t know which files to use! Awesome!

@tralfamadorian97 just looking back on this, how do you decide which files to use?
