Genetics: HLA-DQA*05:01

Of the 8 hits, only 1 was directly measured, but the others have a high INFO_SCORE, suggesting that their distributions follow Hardy-Weinberg Equilibrium.


Yes, I don't understand how that works.
I suppose it's based on the correlations between SNPs being strong enough that you only need to know a few to be pretty certain what the others are. But I was quite surprised that the proportion of SNPs actually measured is so low (around 5% of the total, apparently).
 

To make it more affordable and practical, the machine that reads the participants' DNA only physically observes a subset of locations on the DNA. In DecodeME, that's 800,000 locations where the output tells you the actual letter each participant had at that location. (Though 400,000 were discarded during quality control.)

They then use algorithms and a reference dataset of European individuals who have had many more locations/SNPs accurately read. It's possible to predict what a DecodeME participant would have at a non-measured location because of linkage disequilibrium (LD): nearby SNPs tend to be inherited together.

If you only measured SNP1 in DecodeME, and if the reference dataset shows that individuals with SNP1 also almost always have SNP2, you can predict what SNP2 in DecodeME probably is. That's the "imputation" and how you get to around 8,000,000 SNPs in DecodeME.
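
To make that concrete, here is a minimal Python sketch of the idea. It is not the pipeline DecodeME used: the reference haplotypes, the alleles, and the function names `build_lookup` and `impute_snp2` are all made up for illustration, and real imputation tools use statistical models over many neighbouring SNPs at once rather than a single pair.

```python
from collections import Counter, defaultdict

# Hypothetical reference panel: each entry is one haplotype, giving the
# allele at SNP1 (measured on the array) and at SNP2 (not measured).
REFERENCE_HAPLOTYPES = [
    ("A", "T"), ("A", "T"), ("A", "T"), ("A", "T"), ("A", "C"),
    ("G", "C"), ("G", "C"), ("G", "C"), ("G", "C"), ("G", "C"),
]

def build_lookup(reference):
    """Count which SNP2 alleles travel with each SNP1 allele."""
    counts = defaultdict(Counter)
    for snp1, snp2 in reference:
        counts[snp1][snp2] += 1
    return counts

def impute_snp2(snp1_allele, lookup):
    """Return the most likely SNP2 allele and its estimated probability."""
    seen = lookup[snp1_allele]
    total = sum(seen.values())
    if total == 0:
        raise ValueError(f"SNP1 allele {snp1_allele!r} not in reference panel")
    allele, n = seen.most_common(1)[0]
    return allele, n / total

lookup = build_lookup(REFERENCE_HAPLOTYPES)
print(impute_snp2("A", lookup))  # ('T', 0.8)
print(impute_snp2("G", lookup))  # ('C', 1.0)
```

The second number is how confident the guess is. Loosely speaking, imputation quality scores summarise that kind of confidence across all participants.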

Then they look at which of these SNPs in the full dataset are related to various genes. Do they cause increased expression of a certain gene? Are they near a gene?

Imputation on Wikipedia:
Genotyping arrays used for genome-wide association studies (GWAS) are based on tagging SNPs and therefore do not directly genotype all variation in the genome. Imputation of the genotypes to a reference panel that has been genotyped for a greater number of variants boosts the coverage of genomic variation beyond the original genotypes. As a consequence, one can assess the effect of more SNPs than those on the original micro-array.
 
I am still puzzled. Maybe it means that the string of dots up and down a skyscraper on the Manhattan plot is mostly 'filled in'. But I fail to see how you can fill in a skyscraper unless you have at least one high dot, which, if it is way above background, seems likely to be real.
How the actual math works, I don't know. But somehow it returns SNPs that have a different significance from the SNPs that were actually measured.
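
One way this can happen, as a toy illustration rather than the real math: imputation uses the combination of several measured SNPs, so an imputed SNP can track the underlying signal better than any single measured one. The Python sketch below assumes made-up data in which an unmeasured SNP U is carried only when both measured SNPs M1 and M2 are carried. All names and counts are hypothetical.

```python
# Hypothetical study: (M1, M2, number of people, number of cases).
# M1 and M2 are measured; U is unmeasured and assumed (from a made-up
# reference panel) to be present only when both M1 and M2 are present.
GROUPS = [
    (1, 1, 20, 16),
    (1, 0, 20, 8),
    (0, 1, 20, 8),
    (0, 0, 20, 8),
]

def chi_square(carrier_of):
    """2x2 chi-square statistic for carrier status vs. case status."""
    # a: carrier cases, b: carrier controls, c: other cases, d: other controls
    a = b = c = d = 0
    for m1, m2, people, cases in GROUPS:
        if carrier_of(m1, m2):
            a += cases
            b += people - cases
        else:
            c += cases
            d += people - cases
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

chi_m1 = chi_square(lambda m1, m2: m1 == 1)
chi_m2 = chi_square(lambda m1, m2: m2 == 1)
chi_u = chi_square(lambda m1, m2: m1 == 1 and m2 == 1)  # imputed U

print(chi_m1, chi_m2, chi_u)  # 3.2 3.2 9.6
```

A larger chi-square means a smaller p-value, so on a Manhattan plot the imputed U would sit higher than either measured SNP, even though neither measured dot is very high on its own.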
 