Preprint Initial findings from the DecodeME genome-wide association study of myalgic encephalomyelitis/chronic fatigue syndrome, 2025, DecodeMe Collaboration

Would you be able to answer my earlier question to Prof Ponting:

This non-scientist's understanding would benefit from knowing the variables involved in the gene-set analyses:
Z = B0 + C1.B1 + ... + Cn.Bn + e


... is Z the 13 gene-analysis ones, or is it all 18k?
... is C1 a binary 0/1 for membership of each modeled gene in the gene-set (set of genes expressed in a tissue)?

Hugely impressed you have done all that work and can get close to the study results - wrangling the actual data gives a better feel for what was actually done.
See here (different letters used but same idea):
To identify tissue specificity of the phenotype, FUMA performs MAGMA gene-property analyses to test relationships between tissue specific gene expression profiles and disease-gene associations. The gene-property analysis is based on the regression model,

Z ∼ β0 + Et βE + A βA + B βB + ϵ

where Z is a gene-based Z-score converted from the gene-based P-value, B is a matrix of several technical confounders included by default. Et is the gene expression value of a testing tissue type c and A is the average expression across tissue types in a data set [...]

We performed a one-sided test (βE>0) which is essentially testing the positive relationship between tissue specificity and genetic association of genes.

The tissue gene-property analysis is a linear regression across all genes. Z is a gene's score from the GWAS and Et is a gene's expression in a tissue. Both are continuous, not binary.

For the gene-set analysis (the ubiquitin, synapse gene sets, etc), there's a binary variable on the right side instead - a gene is either in the gene set or not. The z-score on the left is still continuous.
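
If it helps to see the two cases side by side, here is a rough sketch on made-up data. It is not MAGMA or FUMA code: the gene count, effect sizes, the helper name ols_one_sided and the 2% gene-set size are all assumptions for illustration, the technical confounders (B) are left out, and the p-value uses a normal approximation rather than whatever the real software does.

```python
import math
import numpy as np

def ols_one_sided(y, predictors):
    """Least-squares fit of y ~ intercept + predictors.

    Returns the coefficient of the FIRST predictor, its t statistic and a
    one-sided p-value for beta > 0 (hypothetical helper, illustration only).
    """
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t = beta[1] / math.sqrt(cov[1, 1])
    # Normal approximation to the t distribution; reasonable with thousands of genes.
    p = 0.5 * math.erfc(t / math.sqrt(2))
    return beta[1], t, p

rng = np.random.default_rng(0)
n_genes = 18_000  # one row per gene; the number is illustrative

# Gene-property analysis: the predictor is CONTINUOUS (expression in a tissue).
avg_expr = rng.normal(size=n_genes)                 # A: average expression across tissues
tissue_expr = avg_expr + rng.normal(size=n_genes)   # Et: expression in the tested tissue
z = 0.05 * tissue_expr + rng.normal(size=n_genes)   # Z: simulated gene-level z-scores
beta_e, t_e, p_e = ols_one_sided(z, [tissue_expr, avg_expr])

# Gene-set analysis: the predictor is BINARY (1 if the gene is in the set, else 0).
in_set = (rng.random(n_genes) < 0.02).astype(float)
z_set = 0.3 * in_set + rng.normal(size=n_genes)     # Z is still continuous
beta_s, t_s, p_s = ols_one_sided(z_set, [in_set])

print(f"gene-property: beta_E = {beta_e:.3f}, one-sided p = {p_e:.2g}")
print(f"gene-set:      beta_S = {beta_s:.3f}, one-sided p = {p_s:.2g}")
```

In both fits every gene contributes one row, and the only difference is whether the column being tested holds a continuous expression value or a 0/1 membership flag.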
 
Fantastic - thanks so much. The paper confused me:
We considered 54 tissue types and identified significant enrichment of these genes’ expression for 13 (p < 0.05/54), all of which were brain regions

it wasn't clear what "these" referred to.
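
On the "p < 0.05/54" part of that sentence: dividing the usual 0.05 cut-off by the number of tests is a Bonferroni correction, so each of the 54 tissues has to clear a much stricter threshold. A quick check of the arithmetic (variable names are just for illustration):

```python
n_tissues = 54   # tissue types tested, from the quoted sentence
alpha = 0.05     # overall significance level, from the quoted sentence

# Bonferroni: divide the overall level by the number of tests.
threshold = alpha / n_tissues
print(f"per-tissue threshold: {threshold:.6f}")  # per-tissue threshold: 0.000926
```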
 