Preprint Dissecting the genetic complexity of myalgic encephalomyelitis/chronic fatigue syndrome via deep learning-powered genome analysis, 2025, Zhang+

Can we make a table of the proteins and functions for these and maybe check further down for sister genes where they show up?

DNMT3A. DNA Methyltransferase 3 Alpha. Epigenetic DNA methylation, gene expression control
ADCY10. Adenylyl Cyclase 10. Formation of cAMP
PPP2R2A. Protein Phosphatase 2 Regulatory Subunit Balpha. Cell cycle
NLGN2. Neuroligin 2. Synapse formation
LEP. Leptin. Weight control/appetite
SYNGAP1. Synaptic Ras GTPase Activating Protein 1. Synapses, MAP kinase signalling
AHCYL2. Adenosylhomocysteinase Like 2. Brain signalling
NLGN1. Neuroligin 1. Synapse formation
DLGAP4. DLG Associated Protein 4. Synapses
HDAC1. Histone Deacetylase 1. Regulation of gene transcription
AMPD2. Adenosine Monophosphate Deaminase 2
AHCYL1. Adenosylhomocysteinase Like 1. Anti-inflammatory cytokine production
SHARPIN. SHANK Associated RH Domain Interactor. Signalling in autoinflammation. Synapses
NME2. NME/NM23 Nucleoside Diphosphate Kinase 2. DNA transcription. Risk factor for EBV-associated lymphoma
NME1-NME2. NME1-NME2 Readthrough. DNA transcription
CACNA2D3. Calcium Voltage-Gated Channel Auxiliary Subunit Alpha2delta 3. Linked to brain development and autism. TCR signalling
NME3. NME/NM23 Nucleoside Diphosphate Kinase 3. Goes with NME1
ZC3H13. Zinc Finger CCCH-Type Containing 13. RNA splicing
CAMK2A. Calcium/Calmodulin Dependent Protein Kinase II Alpha. Synapses on dendrites in brain
PIK3CA. Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Alpha. Insulin responses, brain development
MAX. MYC Associated Factor X. DNA transcription
HLA-C. Major Histocompatibility Complex, Class I, C (MHC I). CD8 T cell receptor and NK cell receptor recognition events
ACE. Angiotensin I Converting Enzyme

We should keep a wide purview. While many of these genes are flagged as neural and relating to synaptic function, they can have important non-canonical roles. The potential link with autism development is fascinating, but autism spectrum disorder has other features beyond the neurodevelopmental, e.g. gastrointestinal dysfunction. So while their effect on synapse formation and maintenance would be important in the primary neurodevelopmental abnormalities we observe, the very common comorbid problems with gut function might be due more to epithelial tight junctions and barrier integrity than to the gut's neural connections.

Tons of us have OI. Does anyone know what synapses would have to be messed up for OI to occur? Are any OI-type genes showing up in the Zhang results?

We should also consider that genes identified in OI might relate more directly to vascular (esp. endothelial cell) function than to neuronal synapses. E.g. NLGN1 and NLGN2 are expressed in vascular endothelial cells. There's also potentially an endocrine aspect, as NLGN2 is also expressed in pancreatic beta cells (insulin secretion), so maybe there's also a link between the suggested GIP secretion (GIP is a splanchnic vasodilator) and increased post-prandial POTS/OI symptoms.

Neuroligin 1 Induces Blood Vessel Maturation by Cooperating with the α6 Integrin (2014, Journal of Biological Chemistry)

Modulation of Angiopoietin 2 release from endothelial cells and angiogenesis by the synaptic protein Neuroligin 2 (2018, Biochemical and Biophysical Research Communications)

Altered Pancreatic Islet Function and Morphology in Mice Lacking the Beta-Cell Surface Protein Neuroligin-2 (2013, PLOS ONE)

Worsening Postural Tachycardia Syndrome Is Associated With Increased Glucose-Dependent Insulinotropic Polypeptide Secretion (2022, Hypertension)
 
After looking at the paper again, I realized I should have run my GSEA using p-values, not attention scores. I don't fully understand their method for interpreting the model, but p-values are the metric they used to choose the top 115 genes that they say are the most important. It didn't change the results much, since the p-value rankings and attention score rankings are pretty similar. (One exception is HLA-C, which is ranked 22 using p-values and 2077 using attention scores.) Again, very similar results on the Genebass CFS data, but the run with p-values can be seen on the last link here.
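For anyone who wants to check how similar two rankings like these are, here is a rough sketch of the comparison. The gene names and numbers are made up (not from the paper), and it assumes that a lower p-value and a higher attention score both mean "more important", which is my reading rather than something I can confirm from their methods.

```python
def rank_by(scores, descending=False):
    """Return {gene: rank}, where rank 1 is the most important gene."""
    ordered = sorted(scores, key=scores.get, reverse=descending)
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same genes."""
    n = len(rank_a)
    d2 = sum((rank_a[g] - rank_b[g]) ** 2 for g in rank_a)
    return 1 - 6 * d2 / (n * (n * n - 1))


def biggest_disagreements(rank_a, rank_b, top=3):
    """Genes whose two ranks differ the most (the HLA-C type of case)."""
    return sorted(rank_a, key=lambda g: abs(rank_a[g] - rank_b[g]), reverse=True)[:top]


# Hypothetical values for illustration only.
p_values = {"GENE_A": 1e-6, "GENE_B": 1e-4, "GENE_C": 1e-5, "GENE_D": 0.01, "GENE_E": 0.2}
attention = {"GENE_A": 0.9, "GENE_B": 0.7, "GENE_C": 0.1, "GENE_D": 0.5, "GENE_E": 0.3}

p_rank = rank_by(p_values)                      # smallest p-value gets rank 1
a_rank = rank_by(attention, descending=True)    # assumption: highest attention gets rank 1

rho = spearman(p_rank, a_rank)
outliers = biggest_disagreements(p_rank, a_rank, top=1)
print(rho, outliers)
```

A correlation near 1 would back up "pretty similar", and the disagreement list is a quick way to spot genes that jump around between the two metrics.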

But I thought it might also be useful to just see the top-ranked clusters using the Zhang HEAL2 list of genes. This is basically an alternative to their method of taking their top-ranked 115 genes and seeing which pathways those specific genes are enriched in, without considering ranking. Instead, I uploaded all of the genes with their -log10 p-values to STRING, so that it looks at all of them and weights them by the metric that is apparently useful for identifying the most important genes.
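In case it helps anyone repeat this, the transform itself is simple. This is only a sketch with made-up genes and p-values; the two-column "gene, tab, value" layout is my assumption about what to paste into STRING's values/ranks input, and the `floor` parameter is just a guard I added so that a p-value of 0 doesn't break the logarithm.

```python
import math


def to_string_input(p_values, floor=1e-300):
    """Return lines of 'GENE<tab>-log10(p)', most significant gene first."""
    rows = []
    for gene, p in p_values.items():
        # Clamp tiny or zero p-values so log10 is always defined.
        score = -math.log10(max(p, floor))
        rows.append((gene, score))
    rows.sort(key=lambda r: r[1], reverse=True)
    return [f"{gene}\t{score:.4f}" for gene, score in rows]


# Hypothetical values for illustration only.
lines = to_string_input({"GENE_A": 1e-6, "GENE_B": 0.05, "GENE_C": 1e-3})
print("\n".join(lines))
```

With -log10, a smaller p-value becomes a larger number, so the most significant genes end up at the top of the input.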

Here is the link to the enrichment analysis with many different gene set collections. (The page is pretty slow to load and interact with.) I merged items with similarity greater than 0.5, and filtered by FDR < 0.001 and enrichment score > 1.0. ("Top of input" for direction is what's interesting, since it relates to genes with low p-values.) Filters can be changed near the bottom of the page.
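If you download the enrichment table instead of using the page, the same filters could be applied offline along these lines. The field names (`fdr`, `enrichment_score`, `direction`) and the rows are hypothetical; the real export may label its columns differently, so treat this as a sketch.

```python
def filter_terms(rows, max_fdr=0.001, min_score=1.0, direction="top"):
    """Keep terms that pass the FDR and enrichment score cutoffs in one direction."""
    return [
        r for r in rows
        if r["fdr"] < max_fdr
        and r["enrichment_score"] > min_score
        and r["direction"] == direction
    ]


# Hypothetical rows for illustration only.
rows = [
    {"term": "TERM_1", "fdr": 0.0002, "enrichment_score": 2.1, "direction": "top"},
    {"term": "TERM_2", "fdr": 0.0078, "enrichment_score": 1.8, "direction": "top"},
    {"term": "TERM_3", "fdr": 0.0001, "enrichment_score": 0.7, "direction": "top"},
    {"term": "TERM_4", "fdr": 0.0003, "enrichment_score": 1.5, "direction": "bottom"},
]

strict = [r["term"] for r in filter_terms(rows)]
relaxed = [r["term"] for r in filter_terms(rows, max_fdr=0.01)]
print(strict, relaxed)
```

Loosening `max_fdr` is the same kind of "less conservative FDR" change that lets additional terms through.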

Here are the top 10 STRING local clusters from its protein-protein interaction network, which I think should be most comparable to the enrichment analysis they did that found synaptic function and proteasomes.

Local Network Cluster (STRING)
1749820661113.png

There are some other interesting gene sets in other collections, though. For example, in DISEASES, the following four match my filters. These include COVID and two diseases that may be related to sex differences. If I filter with a less conservative FDR, then autistic disorder is the most highly enriched disease, with FDR = 0.0078.

Disease-gene Associations (DISEASES)
1749820466326.png

For the "Tissue Expression (TISSUES)" collection, the two gene sets are "autonomic nervous system" and "brain ventricle". For "Human Phenotype (Monarch)", the top ranked phenotype is "myeloid leukemia".

Tissue Expression (TISSUES)
1749820521793.png

Human Phenotype (Monarch)
1749820407091.png
 