Genetic Predisposition for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Pilot Study 2019 Perez Nathanson Klimas et al

Sly Saint

Genetic Predisposition for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Pilot Study
Melanie Perez1, 2, Rajeev Jaundoo1, Kelly Hilton1, 2, Ana Del Alamo1, 2, Kristina Gemayel1, Nancy G. Klimas1, 2, Travis J. Craddock1, 3* and Lubov Nathanson1, 2*

Introduction: Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a multifactorial illness of unknown etiology with considerable social and economic impact. To investigate a putative genetic predisposition to ME/CFS we conducted genome-wide single-nucleotide polymorphism (SNP) analysis to identify possible variants. Methods: 383 ME/CFS participants underwent DNA testing using the commercial company 23andMe.

The de-identified genetic data was then filtered to include only non-synonymous and nonsense SNPs from exons and microRNAs, and SNPs close to splice sites. The frequencies of each SNP were calculated within our cohort and compared to frequencies from the Kaviar reference database. Functional annotation of pathway sets containing SNP genes with high frequency in ME/CFS was performed using over-representation analysis via ConsensusPathDB.

Furthermore, these SNPs were also scored using the Combined Annotation Dependent Depletion (CADD) algorithm to gauge their deleteriousness. Results: 5693 SNPs were found to have at least 10% frequency in at least one cohort (ME/CFS or reference) and at least two-fold absolute difference for ME/CFS. Functional analysis identified the majority of SNPs as related to immune system, hormone, metabolic and extracellular matrix organization. CADD scoring identified 517 SNPs in these pathways that are among the 10% most deleteriousness substitutions to the human genome.
Provisionally accepted; full text to be published soon

https://www.frontiersin.org/articles/10.3389/fped.2019.00206/abstract
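The selection rule in the abstract (at least 10% frequency in at least one cohort, plus at least a two-fold difference) can be sketched roughly as below. This is only an illustration, not the authors' code: the function name `passes_filter`, its parameters, and the example frequencies are all hypothetical, and it assumes "two-fold difference" means the larger frequency is at least twice the smaller one in either direction.

```python
def passes_filter(freq_ref, freq_mecfs, min_freq=0.10, min_fold=2.0):
    """Return True if a SNP meets both criteria described in the abstract.

    Assumption: "two-fold difference" means the larger frequency is at
    least min_fold times the smaller one, in either direction.
    """
    # Criterion 1: at least 10% frequency in at least one cohort.
    if max(freq_ref, freq_mecfs) < min_freq:
        return False
    lo, hi = sorted((freq_ref, freq_mecfs))
    # A zero frequency in one cohort is treated as an unbounded fold change.
    if lo == 0:
        return hi > 0
    # Criterion 2: at least a two-fold difference between the cohorts.
    return hi / lo >= min_fold


# Hypothetical frequencies, for illustration only (not from the paper).
examples = {
    "snp_a": (0.05, 0.20),  # 4-fold higher in ME/CFS and common there
    "snp_b": (0.30, 0.35),  # common, but similar in both cohorts
    "snp_c": (0.01, 0.04),  # 4-fold, but rare in both cohorts
}
kept = [name for name, (ref, me) in examples.items() if passes_filter(ref, me)]
print(kept)
```

Under these assumptions only the first hypothetical SNP is kept, because the other two each fail one of the two criteria.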
 
CADD scoring identified 517 SNPs in these pathways that are among the 10% most deleteriousness substitutions to the human genome.

I've no real understanding of what all this means, but with 517 not being far off 10% of the total SNPs tested (5693), isn't that what would be expected statistically anyway?
 
I've no real understanding of what all this means, but with 517 not being far off 10% of the total SNPs tested (5693), isn't that what would be expected statistically anyway?
Maybe each SNP can have more than one substitution, some more deleterious than others.
 
I've no real understanding of what all this means, but with 517 not being far off 10% of the total SNPs tested (5693), isn't that what would be expected statistically anyway?

As I read it 5693 is the number of SNPs found to have a twofold difference between groups, not the number tested. Without any real denominators in these figures I cannot make anything of their significance.
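For reference, the arithmetic behind the "not far off 10%" observation is sketched below. It only reproduces the proportion from the two figures in the abstract. Since 5693 is the number of SNPs passing the frequency filter rather than the number tested, and the 517 are restricted to the named pathways, the result cannot be compared directly with the 10% CADD cut-off. The variable names are illustrative.

```python
n_flagged = 5693     # SNPs with >=10% frequency in one cohort and a two-fold difference
n_top_decile = 517   # SNPs in the named pathways scored by CADD in the top 10%

# Share of the flagged SNPs that fall in the CADD top decile.
share = n_top_decile / n_flagged
print(f"{share:.1%} of the {n_flagged} flagged SNPs")
```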
 
Functional analysis identified the majority of SNPs as related to immune system, hormone, metabolic and extracellular matrix organization. CADD scoring identified 517 SNPs in these pathways that are among the 10% most deleteriousness substitutions to the human genome.


I am curious to see if they identified a gene called SERPINE1 / PAI1 which is part of extracellular matrix organisation.

Others of interest are BMP7 and genes related to calpain (CAPN1, CAPNS1).
 
I believe this paper may be this NOVA study where PwME volunteered their 23andMe data. They advertised quite widely for participants so it's a shame not more took part.
https://www.nova.edu/nim/research/mecfs-genes.html

Note that 23andMe only tests a subset of genetic data. I believe there have been 5 versions of the chip, and each version reports on a slightly different set of SNPs.

Just reading the abstract (not a lot to go on) it may tie in to what Alan Light reported in his video presentation in this thread. Certainly worth comparing the video and full paper when it comes out.
https://www.s4me.info/threads/lives...ations-create-susceptibility-for-me-cfs.7072/

EDIT: Patients paid for their own DNA analysis and shared the data with NOVA, which allowed this study to happen.
 
EDIT: Patients paid for their own DNA analysis and shared the data with NOVA, which allowed this study to happen.
good work, patients.

Not sure the study was big enough to make any conclusions, but I am not sure I have the right data to know that (nor recall enough stats to do the right calculation).

Although 383 is a lot for an ME study, genetic studies assess a huge amount of information and typically need very large studies.
 
The one thing I'm not sure about is 23andMe coverage accuracy which could render this paper questionable.

For example, the first and only data point I picked from supplementary table 1 is rs199535154, a missense variant that appears to be present in 94% of patients but only 0.2% of the population.
RSID          Frequency_REF   Frequency_MECFS   Frequency_Ratio
rs199535154   0.002312        0.942558747       407.6811189
This could be in the CYP2D6 gene, which is highlighted in the paper as significant and is the first gene discussed in the results section (based on https://www.ncbi.nlm.nih.gov/snp/rs199535154).
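As a quick sanity check on the rs199535154 row, the ratio can be recomputed from the two frequencies, and the cohort frequency can be converted back to a head count. This is a sketch only: it assumes Frequency_MECFS is a fraction of the 383 participants named in the abstract (rather than, say, an allele frequency), and the variable names are illustrative.

```python
freq_ref = 0.002312        # Frequency_REF from supplementary table 1
freq_mecfs = 0.942558747   # Frequency_MECFS from supplementary table 1
cohort_size = 383          # ME/CFS participants, from the abstract

# Recompute the reported Frequency_Ratio column.
ratio = freq_mecfs / freq_ref

# Assumption: the cohort frequency is carriers divided by participants.
implied_count = freq_mecfs * cohort_size

print(f"ratio = {ratio:.4f}")
print(f"implied carriers = {implied_count:.2f} of {cohort_size}")
```

If that assumption holds, the frequency corresponds to a whole number of participants, which is at least consistent with it being a per-person call rate.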

The paper states files using v4 and v5 versions of the 23andMe chip were analyzed. I have a v5 file and I have no call for rs199535154 or for hg19:chr22:42524814 in the file.

Very puzzling. I hope I am wrong and I just made a rookie mistake in looking at the data.
 
The paper states files using v4 and v5 versions of the 23andMe chip were analyzed. I have a v5 file and I have no call for rs199535154 or for hg19:chr22:42524814 in the file.
Me neither, but I don't have 23andme data. (Although that shouldn't be the reason why I don't find it.)
 
@wigglethemouse

Thank you for tagging me. I had a quick look, and the two I can name are GPBAR1 and CYP2D6 (although I never disclosed the second one).

Regarding GPBAR1 :

[Image attachment: Screen Shot 2019-05-24 at 23.09.17.png]


Interestingly, for the liver injury hypothesis, CYP2D6 is involved in drug metabolism:


Considerable variation exists in the efficiency and amount of CYP2D6 enzyme produced between individuals. Hence, for drugs that are metabolized by CYP2D6 (that is, are CYP2D6 substrates), certain individuals will eliminate these drugs quickly (ultrarapid metabolizers) while others slowly (poor metabolizers). If a drug is metabolized too quickly, it may decrease the drug's efficacy while if the drug is metabolized too slowly, toxicity may result.[5] So, the dose of the drug may have to be adjusted to take into account of the speed at which it is metabolized by CYP2D6.[6]

I will look more into this in a couple of days.
 
I spent some time looking at Supplemental table 2 vs the data in my 23andMe v5 file.

I checked for CADD scores > 1 in supplementary table 2 and found 525 variants listed. I then looked up these 525 variants in my 23andMe data:
* My 23andMe v5 file has values for ONLY 190 of them
* 335 rsIDs are not present in my v5 file
- I searched by rsID
- and by Chr & Position (to catch cases where 23andMe used an internal reference)
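The lookup described above (by rsID first, then by chromosome and position) could be sketched as follows. It assumes the raw data file is tab-separated with `#` comment lines and the columns rsid, chromosome, position, genotype, and that positions in the file use the same genome build as the query. The function names and the sample rows are hypothetical.

```python
import csv
import io


def load_raw_data(handle):
    """Index a 23andMe-style raw data file by rsID and by (chromosome, position)."""
    by_rsid, by_pos = {}, {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment/header lines, and malformed rows.
        if not row or row[0].startswith("#") or len(row) < 4:
            continue
        rsid, chrom, pos, genotype = row[:4]
        by_rsid[rsid] = genotype
        by_pos[(chrom, pos)] = genotype
    return by_rsid, by_pos


def lookup(by_rsid, by_pos, rsid, chrom, pos):
    """Return the genotype, trying the rsID first, then chromosome and position."""
    genotype = by_rsid.get(rsid)
    if genotype is None:
        # Catches variants stored under an internal ID instead of an rsID.
        genotype = by_pos.get((str(chrom), str(pos)))
    return genotype  # None means the variant is absent from the file


# Hypothetical file contents, for illustration only.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "i0000002\t2\t2000\tCC\n"
)
by_rsid, by_pos = load_raw_data(sample)
found = lookup(by_rsid, by_pos, "rs0000001", 1, 1000)
via_position = lookup(by_rsid, by_pos, "rs9999999", 2, 2000)
missing = lookup(by_rsid, by_pos, "rs199535154", 22, 42524814)
```

Counting how many of the 525 table entries come back as `None` would give the kind of present/absent tally reported above.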


In addition, it looks like there are also some issues with the Kaviar frequencies used in their analysis. For example, BCAM has one of the top CADD scores in table 1 of the paper, which reports a Kaviar frequency of 0.000006 for it.

rs3810141 maps to hg19 Chr 19 position 45316804. Kaviar lists two different frequencies depending on which allele is referenced:
C: reference
T: 7.8074%
A: 0.0006%
(Looked up using the tool at http://db.systemsbiology.net/kaviar/cgi-pub/Kaviar.pl)
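To make the allele issue concrete, the sketch below converts the two Kaviar percentages quoted above into fractions and checks which one matches the 0.000006 reported in the paper. It assumes the paper's value is a fraction rather than a percentage. It says nothing about which allele the 23andMe chip actually reports, and the variable names are illustrative.

```python
# Alternate-allele frequencies for rs3810141 as quoted from Kaviar, in percent.
kaviar_percent = {"T": 7.8074, "A": 0.0006}
kaviar_fraction = {allele: pct / 100 for allele, pct in kaviar_percent.items()}

paper_value = 0.000006  # Kaviar frequency reported in the paper for BCAM

# Which allele does the paper's reference frequency correspond to?
matching = [a for a, f in kaviar_fraction.items() if abs(f - paper_value) < 1e-9]

# Factor by which a frequency ratio would shift if the other allele
# were the one that should have been used as the reference.
shift = kaviar_fraction["T"] / kaviar_fraction["A"]

print(matching, round(shift))
```

Under that assumption the paper's value lines up with the rarer A allele, so if the genotyped variant were in fact the T allele, the reported ratio would be overstated by roughly four orders of magnitude.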

So it is likely that there are a few errors in the reference frequency data as well.

Not sure what this means. I wish the paper had gone into more detail about the v4 vs v5 array chips used and how this might affect the data, and had been a bit more careful with the Kaviar frequencies.

Still, for the likes of Alan Light/Bateman Horne center who are doing their own genetics study it must be good to compare data.
 
The reference frequency data used in the paper seems, from what I can see, to be based on Kaviar frequencies, which in many cases are very different from 1000 Genomes or gnomAD frequency data, so MOST of the high MECFS/Reference frequency ratios in the spreadsheets are wrong.

In addition, the tables seem to include 23andMe miscalls such as CYP2D6 rs1135830 and rs199535154, which have no 1000 Genomes data. Both have been flagged:
http://phase3browser.1000genomes.org/Homo_sapiens/Variation/Population?v=rs1135830;vdb=variation
http://phase3browser.1000genomes.org/Homo_sapiens/Variation/Population?v=rs199535154;vdb=variation


Paper needs a major revision.

Sorry @mariovitali, but both GPBAR1 and CYP2D6, which you were interested in, seem to be miscalls.
 