What would successful brainstorming about ME/CFS genes look like?

Great start @Sasha! Usually I’ll start at GeneCards as well, and then the first thing I’ll do is scroll allllll the way down to the tissue expression chart that looks something like this:

[Image: IMG_9784.png, tissue expression chart]
That’s a compilation from various databases looking at where RNA for that gene has been detected in the human body. Detected RNA doesn’t always mean that the protein is being produced, but it can be a good starting point for understanding where it might be playing important roles. The Human Protein Atlas is also a good resource, as it can sometimes tell you where the protein has actually been detected (i.e. where the gene is actually functioning). That will give you a good idea of whether it’s a gene that is really only functioning in one organ/context, or if it’s a gene that has just been characterized extensively in only one context.

The summary sections of GeneCards are usually a mishmash of one-line summaries from various papers that mention the gene name in some context, so read them as such. There is no coherent narrative being presented here. Unfortunately GeneCards is not particularly forthcoming about the sources that these summaries are drawing from. If something sticks out to me, I’ll usually just Google the terms and the gene name and see what comes up, like “ERK pathway SYNGAP1” or “Autism SYNGAP1”.

Usually studies related to those more technical terms will be cell biology studies where they knocked out the gene and found that the cell was unable to do XYZ. Sometimes you’ll find knockout mouse models for exploring more complicated organ systems. The ones related to human disease names are going to be genetics studies that provide no mechanistic insight. If the paper is pure technical gobbledygook you can skip to the discussion, just keeping in mind that you’re reading someone’s attempt to weave a narrative and it may present a shinier story than what the data actually shows.

I agree with @hotblack that AI might be good for defining basic concepts. I’d expect it to at least point you in the right direction for what the ERK pathway is, for example. But if you ask it “how might SYNGAP1 relate to ME/CFS?” it will probably just draw from whatever half-baked theories people have proposed about the HPA axis or neural connectivity.

Once you do this for enough genes you’ll probably see certain words and phrases start to come up that paint a story in your head. And of course that process of pattern recognition will often be driven by confirmation bias, but there’s not much you can do about it for these purposes. It’s still a useful exercise just for driving some learning!
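If you want to make that pattern-spotting a bit less dependent on memory, here is a minimal sketch of tallying which terms recur across your notes on several genes. The gene names and keywords below are placeholders I made up for illustration, not real annotations, and the semicolon-separated note format is just an assumption about how you might jot things down.

```python
from collections import Counter

# Hypothetical notes: placeholder gene names and made-up keywords,
# NOT real annotations for any gene.
notes = {
    "GENE_A": "synaptic plasticity; ERK pathway; neurodevelopment",
    "GENE_B": "ERK pathway; T cell activation",
    "GENE_C": "synaptic plasticity; ion channel",
}

def recurring_terms(notes, min_genes=2):
    """Return terms that show up in the notes of at least `min_genes` genes."""
    counts = Counter()
    for text in notes.values():
        # Split on semicolons, normalise case, and de-duplicate within one gene
        # so a term is counted once per gene, not once per mention.
        terms = {t.strip().lower() for t in text.split(";") if t.strip()}
        counts.update(terms)
    return {term: n for term, n in counts.items() if n >= min_genes}

shared = recurring_terms(notes)
print(sorted(shared.items()))
```

This only counts what you chose to write down, so it inherits whatever confirmation bias went into the notes. It is a bookkeeping aid, not evidence.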
 
Another good starting point is the ME/CFS paper you got the gene from in the first place.

The authors are likely to have done at least a scan of the literature to see how the significant genes might fit into the explanation for ME/CFS. So they did a bit of the work for you, and you can check if the things they linked make sense and continue researching from that starting idea.

For example, from the Zhang paper:
As highlighted in our network analysis, ME/CFS genes participate in biological pathways associated with synaptic function (M20; Fig. 4B and 4D). SYNGAP1, another gene driving downregulation of the ME/CFS genes in cytotoxic CD4 T cells (Fig. 4G) and a member of M20, is involved in synaptic signaling and plasticity, essential for brain function with mutations linked to neurodevelopmental and psychiatric disorders [53]. SYNGAP1’s role in synaptic signaling highlights its potential connection to neurological symptoms in ME/CFS, offering therapeutic potential in neuroprotective strategies [54].

And they link to a couple papers that they base these claims on (the 53 and 54).
 
Except that the great thing about gene linkages is that they are always the root cause.
Isn't one of the problems going to be that genes are sometimes inherited in strings of genes rather than as individual units, so if you have a disease association with a particular SNP, it might not be the gene containing that SNP wot dunnit? But the gene next to it, or a bit further down the line?

Do adjacent genes tend to produce proteins that do related stuff, such as affect sleep, for instance?
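The "inherited in strings" worry is usually put into numbers as linkage disequilibrium: how strongly the allele at one SNP predicts the allele at a nearby one. Below is a minimal sketch of the standard r² statistic computed from phased haplotypes. The haplotype data are made up for illustration, and real analyses use dedicated genetics tools rather than a hand-rolled function like this.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2), each coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP has no variation")
    return d * d / denom

# Made-up haplotypes, for illustration only.
tightly_linked = [(1, 1)] * 4 + [(0, 0)] * 6          # alleles always travel together
independent = [(1, 1), (1, 0), (0, 1), (0, 0)] * 3    # every combination equally common

r2_linked = ld_r_squared(tightly_linked)
r2_independent = ld_r_squared(independent)
print(r2_linked, r2_independent)
```

When r² is close to 1, an association signal at one SNP cannot by itself tell you which of the two positions (or which nearby gene) is doing the work, which is exactly the problem raised above.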
 
But the gene next to it, or a bit further down the line?

You're getting a bit technical here, @Sasha. This is what imputation protocols are for. God knows how they work.

Do adjacent genes tend to produce proteins that do related stuff, such as affect sleep, for instance?

Sometimes, sometimes not. For instance, in the MHC you have Class I genes at one end, whose proteins bind to CD8 T cells, Class II genes at the other, whose proteins bind to CD4 T cells, and Class III genes (someone boobed here) in the middle that do all sorts of unrelated jobs making cytokines, complement and whatever. It is a bit like a cutlery drawer with knives on the right, forks on the left, and the middle full of corkscrews, garlic crushers and rubber bands.

Lots of genes evolved by one gene getting duplicated, maybe through a slightly off-target chromosome arm switching in a germline cell, and then the duplicate mutating to do a similar but slightly different job from the original. That makes sense if you want a whole row of receptors all with the same anchor tail and basic shape but each with a different 'receiving' end to fit something different.
 
Any chance you can explain the difference in the three columns and why they're not the same? (RNAseq, SAGE, and microarray)
It’s just three different methods to detect RNA from tissue—there are technical variations in how the RNA is captured, whether you need to know the sequence ahead of time or if it’s an unbiased discovery, how prone they are to issues that skew exact quantification of transcripts, the range of total transcripts that can be reliably detected etc.

SAGE and microarray are two older methods that were the precursors to high-throughput RNA-seq.

Unfortunately there’s no hard and fast rule for which one should be considered “ground truth” above the others when they disagree, so I usually try to look at what they're all telling me cumulatively.

Some of the discordance will also be because one method just didn’t have data available for that gene/tissue. That’s the most likely explanation for cases where one or two methods show high expression and another shows nothing at all.
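To make the "look at them cumulatively" idea concrete, here is a rough sketch of one way to do it: rank tissues within each method (so the different units don't matter), average the ranks, and treat missing data as missing rather than as zero. The numbers are invented, the method labels are just dictionary keys, and rank-averaging is my own assumption for illustration, not how GeneCards or any database combines these columns.

```python
def summarize_expression(measurements):
    """Combine per-method expression into (mean scaled rank, methods with data).

    measurements: {tissue: {method: value or None}}; None means "no data",
    which is deliberately NOT treated as zero expression.
    """
    methods = sorted({m for per_tissue in measurements.values() for m in per_tissue})
    ranks = {tissue: [] for tissue in measurements}
    for method in methods:
        # Only tissues that actually have a value for this method get ranked.
        observed = [(t, v[method]) for t, v in measurements.items()
                    if v.get(method) is not None]
        ordered = sorted(observed, key=lambda pair: pair[1])
        for position, (tissue, _) in enumerate(ordered, start=1):
            # Scale rank to (0, 1]; 1.0 is the highest tissue for this method.
            # Ties are not handled specially in this sketch.
            ranks[tissue].append(position / len(ordered))
    return {t: (sum(r) / len(r) if r else None, len(r)) for t, r in ranks.items()}

# Invented values for illustration only.
example = {
    "brain":  {"rnaseq": 50.0, "microarray": 900.0, "sage": None},
    "liver":  {"rnaseq": 2.0,  "microarray": 100.0, "sage": 1.0},
    "muscle": {"rnaseq": 10.0, "microarray": 300.0, "sage": 4.0},
}

summary = summarize_expression(example)
for tissue, (score, n_methods) in summary.items():
    print(tissue, score, n_methods)
```

Keeping the count of methods next to the score matters: a tissue that looks high in two methods and blank in the third is a different situation from one that all three agree on.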
 