Genetic Predisposition for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Pilot Study 2019 Perez Nathanson Klimas et al

Sly Saint · May 8, 2019

Genetic Predisposition for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Pilot Study

Melanie Perez1, 2, Rajeev Jaundoo1, Kelly Hilton1, 2, Ana Del Alamo1, 2, Kristina Gemayel1, Nancy G. Klimas1, 2, Travis J. Craddock1, 3* and Lubov Nathanson1, 2*

Introduction: Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) is a multifactorial illness of unknown etiology with considerable social and economic impact. To investigate a putative genetic predisposition to ME/CFS we conducted genome-wide single-nucleotide polymorphism (SNP) analysis to identify possible variants.

Methods: 383 ME/CFS participants underwent DNA testing using the commercial company 23andMe. The de-identified genetic data was then filtered to include only non-synonymous and nonsense SNPs from exons and microRNAs, and SNPs close to splice sites. The frequencies of each SNP were calculated within our cohort and compared to frequencies from the Kaviar reference database. Functional annotation of pathway sets containing SNP genes with high frequency in ME/CFS was performed using over-representation analysis via ConsensusPathDB. Furthermore, these SNPs were also scored using the Combined Annotation Dependent Depletion (CADD) algorithm to gauge their deleteriousness.

Results: 5693 SNPs were found to have at least 10% frequency in at least one cohort (ME/CFS or reference) and at least two-fold absolute difference for ME/CFS. Functional analysis identified the majority of SNPs as related to immune system, hormone, metabolic and extracellular matrix organization. CADD scoring identified 517 SNPs in these pathways that are among the 10% most deleteriousness substitutions to the human genome.

Provisionally accepted; full text to be published soon

https://www.frontiersin.org/articles/10.3389/fped.2019.00206/abstract

Amw66 · May 8, 2019

Sly Saint said:
Genetic Predisposition for Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Pilot Study
Melanie Perez1, 2, Rajeev Jaundoo1, Kelly Hilton1, 2, Ana Del Alamo1, 2, Kristina Gemayel1, Nancy G. Klimas1, 2, Travis J. Craddock1, 3* and Lubov Nathanson1, 2*

Provisionally accepted; full text to be published soon

https://www.frontiersin.org/articles/10.3389/fped.2019.00206/abstract

@BeautifulDay

John Mac · May 8, 2019

CADD scoring identified 517 SNPs in these pathways that are among the 10% most deleteriousness substitutions to the human genome.

I've no real understanding of what all this means, but with 517 not being far off 10% of the total SNPs tested (5693), isn't that what would be expected statistically anyway?

Trish · May 8, 2019

John Mac said:
I've no real understanding of what all this means, but with 517 not being far off 10% of the total SNPs tested (5693), isn't that what would be expected statistically anyway?

Maybe each SNP can have more than one substitution, some more deleterious than others.

Jonathan Edwards · May 8, 2019

John Mac said:
I've no real understanding of what all this means, but with 517 not being far off 10% of the total SNPs tested (5693), isn't that what would be expected statistically anyway?

As I read it 5693 is the number of SNPs found to have a twofold difference between groups, not the number tested. Without any real denominators in these figures I cannot make anything of their significance.
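
The arithmetic behind the question can be checked directly. This is only a sketch using the two counts quoted from the abstract; it assumes, as the question does, that 5693 is the right denominator for the 517, which the abstract does not confirm.

```python
# Counts quoted from the abstract.
snps_passing_filter = 5693    # SNPs meeting the frequency and two-fold criteria
snps_top_decile_cadd = 517    # pathway SNPs among the 10% most deleterious by CADD

# Share of the filtered SNPs that fall in the top CADD decile,
# ASSUMING 5693 is the relevant denominator (not stated in the abstract).
share = snps_top_decile_cadd / snps_passing_filter
print(f"{share:.1%}")  # 9.1%
```

If the 517 were drawn from a smaller set (for example, only the SNPs in the named pathways), the share would be higher, which is why the missing denominators matter.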

mariovitali · May 9, 2019

Functional analysis identified the majority of SNPs as related to immune system, hormone, metabolic and extracellular matrix organization. CADD scoring identified 517 SNPs in these pathways that are among the 10% most deleteriousness substitutions to the human genome.

I am curious to see if they identified a gene called SERPINE1 / PAI1 which is part of extracellular matrix organisation.

Others of interest are BMP7 and genes related to Calpain (CAPN1, CAPNS1).

Snow Leopard · May 18, 2019

Still looks like more haystack than needles...

wigglethemouse · May 18, 2019

I believe this paper may be the NOVA study where PwME volunteered their 23andMe data. They advertised quite widely for participants, so it's a shame more didn't take part.
https://www.nova.edu/nim/research/mecfs-genes.html

Note that 23andMe only tests a subset of genetic data. I believe there have been five versions of the chip, and each version reports on a slightly different set of SNPs.

Just reading the abstract (not a lot to go on) it may tie in to what Alan Light reported in his video presentation in this thread. Certainly worth comparing the video and full paper when it comes out.
https://www.s4me.info/threads/lives...ations-create-susceptibility-for-me-cfs.7072/

EDIT: Patients paid for their own DNA analysis and shared the data with NOVA, so that allowed this study to happen.

WillowJ · May 19, 2019

wigglethemouse said:
EDIT: Patients paid for their own DNA analysis and shared the data with NOVA, so that allowed this study to happen.

good work, patients.

Not sure the study was big enough to make any conclusions, but I am not sure I have the right data to know that (nor recall enough stats to do the right calculation).

Although 383 is a lot for an ME study, genetic studies assess a huge amount of information and typically need very large studies.

WillowJ · May 19, 2019

Found a citation about the largeness of the studies:
https://www.s4me.info/threads/size-...uirements-for-human-genome-epidemiology.9573/

Trish · May 24, 2019

Looks like the full paper is out now:
https://www.frontiersin.org/articles/10.3389/fped.2019.00206/full

wigglethemouse · May 24, 2019

The one thing I'm not sure about is the accuracy of 23andMe's coverage, which could render this paper questionable.

For example, the first and only data point I picked from supplementary table 1 is rs199535154, a missense variant that seems to be present in 94% of patients but only 0.2% of the population.

RSID        Frequency_REF   Frequency_MECFS     Frequency_Ratio
rs199535154 0.002312        0.942558747         407.6811189

This variant appears to be in CYP2D6, the first gene discussed in the paper's results section and highlighted there as significant (based on https://www.ncbi.nlm.nih.gov/snp/rs199535154).

The paper states files using v4 and v5 versions of the 23andMe chip were analyzed. I have a v5 file and I have no call for rs199535154 or for hg19:chr22:42524814 in the file.

Very puzzling. I hope I am wrong and I just made a rookie mistake in looking at the data.
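
The quoted table row can be sanity-checked with a few lines. This is a sketch, and it assumes that Frequency_MECFS is the fraction of the 383 participants carrying the variant, which the thread does not confirm.

```python
# Values copied from the supplementary table row quoted above.
freq_ref = 0.002312
freq_mecfs = 0.942558747
cohort_size = 383  # ME/CFS participants, from the abstract

# Reproduce the Frequency_Ratio column.
ratio = freq_mecfs / freq_ref
print(f"{ratio:.2f}")  # 407.68

# ASSUMPTION: Frequency_MECFS is carriers / participants.
# Under that reading, the implied number of carriers is:
implied_count = freq_mecfs * cohort_size
print(f"{implied_count:.1f}")  # 361.0
```

A frequency that lands on a whole number of participants (361 of 383) is consistent with a per-person carrier fraction. It says nothing about whether the underlying genotype calls are reliable.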

wigglethemouse · May 24, 2019

Tagging @mariovitali as you might want to look at supplemental table 2 and re-sort by CADD score to see if any of your liver genes spring out to you
https://www.frontiersin.org/articles/10.3389/fped.2019.00206/full#supplementary-material

Inara · May 24, 2019

wigglethemouse said:
The paper states files using v4 and v5 versions of the 23andMe chip were analyzed. I have a v5 file and I have no call for rs199535154 or for hg19:chr22:42524814 in the file.

Me neither, but I don't have 23andme data. (Although that shouldn't be the reason why I don't find it.)

mariovitali · May 24, 2019

@wigglethemouse

Thank you for tagging me. I had a quick look, and the two I can point to are GPBAR1 and CYP2D6 (although I never disclosed the second one).

Regarding GPBAR1:

Interestingly, for the liver injury hypothesis, CYP2D6 has to do with drug metabolism:

Considerable variation exists in the efficiency and amount of CYP2D6 enzyme produced between individuals. Hence, for drugs that are metabolized by CYP2D6 (that is, are CYP2D6 substrates), certain individuals will eliminate these drugs quickly (ultrarapid metabolizers) while others slowly (poor metabolizers). If a drug is metabolized too quickly, it may decrease the drug's efficacy while if the drug is metabolized too slowly, toxicity may result.[5] So, the dose of the drug may have to be adjusted to take into account of the speed at which it is metabolized by CYP2D6.[6]

I will look more into this in a couple of days.

wigglethemouse · May 25, 2019

I spent some time looking at Supplemental table 2 vs the data in my 23andMe v5 file.

I checked CADD scores > 1 in supplementary table 2 and found 525 variants listed. I looked up my 23andMe data for these 525 variants:
* My 23andMe v5 file has values for ONLY 190 of them
* 335 rsIDs are not present in my v5 file
  - I searched by rsID
  - and searched by Chr & Position (to catch where 23andMe used an internal reference)
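
The lookup described above can be sketched as follows. It assumes the usual 23andMe raw data layout (tab-separated `rsid`, `chromosome`, `position`, `genotype`, with `#` comment lines and `--` for no-calls). The function names and the toy rows are hypothetical; only rs199535154 at chr22:42524814 comes from this thread.

```python
import csv
import io

def load_23andme(handle):
    """Build rsID and (chromosome, position) lookups from a 23andMe raw export."""
    by_rsid, by_position = {}, {}
    data_lines = (line for line in handle if not line.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        rsid, chrom, pos, genotype = row
        by_rsid[rsid] = genotype
        by_position[(chrom, int(pos))] = genotype
    return by_rsid, by_position

def has_call(variant, by_rsid, by_position):
    """True if the file has a real genotype by rsID or, failing that, by position."""
    genotype = by_rsid.get(variant["rsid"])
    if genotype is None:
        # Catches variants the file lists under an internal ID instead of an rsID.
        genotype = by_position.get((variant["chrom"], variant["pos"]))
    return genotype is not None and genotype != "--"

# Toy file contents (hypothetical rows, not real data).
sample = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "i0000002\t1\t2000\tCT\n"
    "rs0000003\t1\t3000\t--\n"
)
variants = [
    {"rsid": "rs0000001", "chrom": "1", "pos": 1000},        # present by rsID
    {"rsid": "rs0000009", "chrom": "1", "pos": 2000},        # present under an internal ID
    {"rsid": "rs0000003", "chrom": "1", "pos": 3000},        # listed, but a no-call
    {"rsid": "rs199535154", "chrom": "22", "pos": 42524814}, # absent from the toy file
]

by_rsid, by_position = load_23andme(io.StringIO(sample))
found = [v["rsid"] for v in variants if has_call(v, by_rsid, by_position)]
print(len(found), "of", len(variants), "variants have a call")
```

Positions only match if the variant list and the raw file use the same genome build (hg19 in the post above).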

In addition, it looks like there are also some issues with the Kaviar frequencies used in their analysis. For example, BCAM, one of the top CADD scores in table 1 of the paper, is reported with a Kaviar frequency of 0.000006.

rs3810141 translates to hg19 Chr 19 position 45316804. Kaviar lists two different frequencies depending on which alternate allele is referenced:
C Reference
T (7.8074%)
A (0.0006%)
Using tool http://db.systemsbiology.net/kaviar/cgi-pub/Kaviar.pl

So it is likely there are a few errors in the reference frequency data as well.
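
To make the mismatch concrete, here is a small sketch that checks which of the two Kaviar alleles the paper's reported value corresponds to. The numbers are the ones quoted above; the function name is hypothetical, and which allele the 23andMe calls actually refer to is not known from this thread.

```python
import math

# Kaviar alternate-allele frequencies for hg19 chr19:45316804, in percent, as quoted above.
kaviar_percent = {"T": 7.8074, "A": 0.0006}

# Kaviar frequency listed for this variant in the paper's table 1 (as a fraction).
reported_in_paper = 0.000006

def matching_alleles(percent_by_allele, reported_fraction):
    """Return the alleles whose Kaviar frequency equals the reported fraction."""
    return [
        allele
        for allele, percent in percent_by_allele.items()
        if math.isclose(percent / 100, reported_fraction, rel_tol=1e-6)
    ]

matches = matching_alleles(kaviar_percent, reported_in_paper)
print(matches)  # ['A']

# How far apart the two candidate reference frequencies are.
fold_difference = kaviar_percent["T"] / kaviar_percent["A"]
print(f"{fold_difference:.0f}x")
```

If the cohort frequency was counted for the T allele while the reference frequency was taken for the A allele, the ME/CFS-to-reference ratio would be inflated by roughly this factor. That is one possible explanation, not a confirmed one.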

Not sure what this means. I wish the paper had gone into more detail about the v4 vs v5 array chips used and how this might affect the data, and had been a bit more careful with the Kaviar frequencies.

Still, for the likes of Alan Light/Bateman Horne Center, who are doing their own genetics study, it must be good to have data to compare.

wigglethemouse · May 25, 2019

The reference frequency data used in the paper seems, from what I can see, to be based on Kaviar frequencies, which in many cases are very different from 1000 Genomes or gnomAD frequency data, so MOST of the high-ratio MECFS/Reference frequency items in the spreadsheets are wrong.

In addition, the tables seem to include 23andMe miscalls such as CYP2D6 rs1135830 and rs199535154, which have no 1000 Genomes data. Both have been flagged:
http://phase3browser.1000genomes.org/Homo_sapiens/Variation/Population?v=rs1135830;vdb=variation
http://phase3browser.1000genomes.org/Homo_sapiens/Variation/Population?v=rs199535154;vdb=variation

Paper needs a major revision.

Sorry @mariovitali, but both GPBAR1 and CYP2D6 that you were interested in seem to be miscalls.

mariovitali · May 26, 2019

@wigglethemouse

Regarding CYP2D6: I have rs28371725; the risk allele is T, with MAF 0.06%:

16.9% are heterozygous in my cohort, with 4.23% missing.

wigglethemouse · May 26, 2019

@mariovitali I think your MAF=0.06% is wrong. Databases are showing an 8% allele frequency:
https://www.ncbi.nlm.nih.gov/snp/rs28371725

OpenSNP is showing a genotype frequency of 17% which matches your 16.9%
https://www.opensnp.org/snps/rs28371725
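
An allele frequency and a genotype frequency are different quantities, and the two figures above can be related with a rough Hardy-Weinberg estimate. This is a sketch under the assumption of Hardy-Weinberg equilibrium, which self-selected datasets such as OpenSNP need not satisfy; the function name is hypothetical.

```python
def hardy_weinberg(minor_allele_freq):
    """Expected genotype fractions for a biallelic site under Hardy-Weinberg equilibrium."""
    p = minor_allele_freq
    q = 1.0 - p
    return {"hom_minor": p * p, "het": 2 * p * q, "hom_major": q * q}

# The two allele frequencies discussed above, as fractions.
for label, maf in [("MAF 8%", 0.08), ("MAF 0.06%", 0.0006)]:
    expected = hardy_weinberg(maf)
    carriers = expected["het"] + expected["hom_minor"]
    print(f"{label}: het {expected['het']:.2%}, any copy {carriers:.2%}")
```

With an 8% allele frequency, about 15% of people would be expected to be heterozygous, which is in the same range as the 16.9% and 17% figures quoted above. With 0.06% the expected figure is around 0.1%.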

mariovitali · May 26, 2019

@wigglethemouse

The MAF I used is from 1000 Genomes

https://www.ncbi.nlm.nih.gov/snp/rs28371725

EDIT: You must switch to the classic dbSNP site to see the 1000 Genomes frequency.

How come OpenSNP shows such a different figure?

As always, thank you for your help.
