Review of the Quality Control Checks Performed by Current Genome-Wide and Targeted-Genome Association Studies on ME/CFS, 2020, Sepulveda et al

Kalliope

Provisionally accepted opinion article in Frontiers in Pediatrics - Paediatric Neurology

Review of the Quality Control Checks Performed by Current Genome-Wide and Targeted-Genome Association Studies on Myalgic Encephalomyelitis/Chronic Fatigue Syndrome

by Nuno Sepulveda, Anna D. Grabowski, Eliana M. Lacerda and Luis C. Nacul

In summary, given the partial QC checks performed in current GWAS and TGAS, the question of a genetic component in ME/CFS remains open for investigation. To accelerate the discovery of promising disease-gene association, future genetic studies of ME/CFS should set data and methodological standards as high as those followed by the 1000 Human Genome Project and the UK10K project (19,20). Data sharing should also be a general practice to provide the researcher community the opportunity to perform additional checks or alternative analyses of the same data.

 
Unfortunately, the list of referenced papers isn't available yet, so the citation numbers don't link to anything.
Perez et al is most likely the awful INIM/Klimas team 23andMe study where no quality control was done.
https://www.frontiersin.org/articles/10.3389/fped.2019.00206/full

The new paper has this quote. I don't remember a limit on MAF of 0.1, but perhaps I'm thinking of the supplemental table they provided.
"On the one hand, the study of Perez et al (8) only performed the QC check based on the MAF. This study also used a non-standard criterium for selecting SNPs: those with MAF<0.10 in either patients or reported in the Kaviar database were excluded from the analysis."

I seem to remember the UK Biobank site (not ME Biobank) had some great discussion on quality control procedures.

EDIT: Link to the thread on the Klimas study
https://www.s4me.info/threads/genet...study-2019-perez-nathanson-klimas-et-al.9415/
 
Summary of the quality control checks they recommend that GWAS studies perform:
  • Identify and exclude monomorphic SNPs (all individuals have same variant for a SNP)
  • Exclude variants with very low minor allele frequency (MAF) (less than ~1-5%)
    • MAF is the frequency of the less common allele among all copies of that SNP in the group being studied (each person carries two copies).
    • "SNPs with a low MAF are rare, therefore power is lacking for detecting SNP-phenotype associations. These SNPs are also more prone to genotyping errors. The MAF threshold should depend on your sample size, larger samples can use lower MAF thresholds. Respectively, for large (N = 100.000) vs. moderate samples (N = 10000), 0.01 and 0.05 are commonly used as MAF threshold." [Source]
  • Use Hardy-Weinberg Equilibrium (HWE) to test whether genotype frequencies (e.g. frequency of AA vs. AT vs. TT) make sense or might be errors.
    • "In theory, deviations of the HWE can result from the genetic selection of a specific allele in patients. Because of this possibility, some researchers prefer to test the HWE using data from healthy controls alone."
  • Check heterozygosity and exclude individuals with high or low heterozygosity compared to other samples.
    • Too little heterozygosity might indicate inbreeding, which can affect statistical analysis.
    • Too much heterozygosity might indicate sample contamination or genotyping errors.
  • "data of SNPs or of individuals with low genotyping rates should be excluded from the analysis."
    • Genotyping rate (call rate) is the proportion of genotypes that were successfully called: across individuals for a given SNP, or across SNPs for a given individual.
  • "Additional QC checks (e.g., assessing the genetic distance between sampled individuals or checking their ancestry) can also be performed in GWAS and TGAS [...]. However, they are more relevant for large-scale population genetic studies."
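To make the checks above more concrete, here is a minimal Python sketch of how they could be computed from a genotype table. It is not the method used by the paper or by any of the studies. The function names, the 0/1/2 genotype coding, the chi-square form of the HWE test, the 3-standard-deviation rule for heterozygosity and all default thresholds other than the MAF range quoted above are assumptions for illustration only.

```python
from math import erfc, sqrt
from statistics import mean, stdev


def snp_qc(genotypes):
    """Summarise one SNP across individuals.

    `genotypes` holds one value per individual: 0, 1 or 2 copies of
    allele B, or None when the genotype call failed (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    n = len(called)
    if n == 0:
        return {"call_rate": 0.0, "maf": None, "monomorphic": None, "hwe_p": None}

    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    freq_b = (n_ab + 2 * n_bb) / (2 * n)  # each person carries two alleles
    maf = min(freq_b, 1 - freq_b)
    monomorphic = maf == 0

    hwe_p = None
    if not monomorphic:
        # Expected genotype counts under HWE: n*p^2, 2*n*p*q, n*q^2
        p, q = 1 - freq_b, freq_b
        expected = (n * p * p, 2 * n * p * q, n * q * q)
        observed = (n_aa, n_ab, n_bb)
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Upper tail of a chi-square distribution with 1 degree of freedom
        hwe_p = erfc(sqrt(chi2 / 2))

    return {"call_rate": call_rate, "maf": maf,
            "monomorphic": monomorphic, "hwe_p": hwe_p}


def passes_qc(stats, min_maf=0.05, min_call_rate=0.95, min_hwe_p=1e-6):
    """Apply illustrative thresholds (assumed values, not from the paper)."""
    if stats["maf"] is None or stats["monomorphic"]:
        return False
    return (stats["maf"] >= min_maf
            and stats["call_rate"] >= min_call_rate
            and stats["hwe_p"] >= min_hwe_p)


def heterozygosity_rate(sample_genotypes):
    """Fraction of one individual's called SNPs that are heterozygous."""
    called = [g for g in sample_genotypes if g is not None]
    return sum(g == 1 for g in called) / len(called)


def flag_heterozygosity_outliers(rates, n_sd=3.0):
    """Flag individuals whose rate is more than n_sd SDs from the mean."""
    m, s = mean(rates), stdev(rates)
    return [abs(r - m) > n_sd * s for r in rates]
```

In practice these steps are usually run with dedicated tools rather than hand-written code, and small samples often use an exact HWE test instead of the chi-square approximation shown here.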

And here are the summarized results of their quality control analysis of the 6 GWAS or TGAS that had been done at the time:
  • 2011, Smith AK et al. Convergent genomic studies identify association of GRIK2 and NPAS2 with chronic fatigue syndrome. [S4ME]
    • Did not exclude SNPs based on MAF

  • 2016, Schlauch KA et al. Genome-wide association analysis identifies genetic variations in subjects with myalgic encephalomyelitis/chronic fatigue syndrome. [S4ME]
    • Heterozygosity only used for confirming gender

  • 2018, Herrera et al. Genome-epigenome interactions associated with myalgic encephalomyelitis/chronic fatigue syndrome. [S4ME]
    • All QC performed

  • 2019, Perez et al. Genetic predisposition for immune system, hormone, and metabolic dysfunction in myalgic encephalomyelitis/chronic fatigue syndrome: a pilot study. [S4ME]
    • Only MAF quality control performed

  • 2015, Rajeevan et al. Pathway-focused genetic evaluation of immune and inflammation related genes with chronic fatigue syndrome. [S4ME]
    • Heterozygosity not performed

  • 2016, Johnston et al. A targeted genome association study examining transient receptor potential ion channels, acetylcholine receptors, and adrenergic receptors in chronic fatigue syndrome/myalgic encephalomyelitis. [Article]
    • HWE and genotyping rate not reported

Only Herrera et al performed all these recommended quality control checks.

They give the example of Herrera et al excluding 5% of participants based on heterozygosity:
"Consequently, it is unclear whether aberrant heterozygosity rates (due to sample contamination) are one of the explanations for the conflicting evidence of genetic associations reported by these studies. In this regard, Herrera et al. (7) excluded five out of their 109 samples (5%) based on the heterozygosity rate. In simple statistical applications using large sample sizes, a 5% sample contamination might be too low to have a substantial impact on the respective findings. However, in the specific context of GWAS and TGAS where stringent significance levels are used to control for multiple testing, such a level of sample contamination could reduce the underlying statistical power and leave relevant disease-gene associations undetected."
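To give a feel for how stringent those significance levels are, here is a small Python sketch using the Bonferroni correction. This is one common way to control for multiple testing and is an assumption here, not necessarily what any of these studies used. The SNP counts are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall level at alpha."""
    return alpha / n_tests


# Hypothetical numbers of tested SNPs, for illustration only
for n_snps in (1_000, 100_000, 1_000_000):
    print(n_snps, bonferroni_threshold(0.05, n_snps))
```

With a million tests, an overall level of 0.05 works out to the conventional genome-wide threshold of 5e-8 per SNP, so even a modest loss of power from a few bad samples can push a real association below the line.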

---

Table 1. Summary of the QC checks performed in published GWAS and TGAS on ME/CFS.