See here (different letters used but same idea):
Would you be able to answer my earlier question to Prof Ponting:
This non-scientist's understanding would benefit from knowing the variables involved in the gene-set analyses:
Z = B0 + C1.B1 + ... + Cn.Bn + e
... is Z the 13 gene-analysis ones, or is it all 18k?
... is C1 a binary 0/1 for membership of each modeled gene in the gene-set (set of genes expressed in a tissue)?
Hugely impressed you have done all that work and can get close to the study results - wrangling the actual data gives a better feel for what was actually done.
To identify tissue specificity of the phenotype, FUMA performs MAGMA gene-property analyses to test relationships between tissue specific gene expression profiles and disease-gene associations. The gene-property analysis is based on the regression model,
Z ~ β_0 + E_t β_E + A β_A + B β_B + ε
where Z is a gene-based Z-score converted from the gene-based P-value, B is a matrix of several technical confounders included by default. E_t is the gene expression value of a testing tissue type c and A is the average expression across tissue types in a data set [...]
We performed a one-sided test (β_E > 0) which is essentially testing the positive relationship between tissue specificity and genetic association of genes.
The tissue gene-property analysis is a linear regression across all genes, not just the top hits. Z is a gene's score from the GWAS and E_t is a gene's expression in a tissue. Both are continuous, not binary.
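If it helps to see the shape of that regression, here is a rough sketch in Python on made-up data. Everything in it is an assumption for illustration: the 18,000 genes, the simulated expression values, the 0.05 effect size and the helper name `ols_one_sided` are all invented, the technical confounders B are left out, and it uses plain least squares with a normal approximation for the one-sided p-value. It is not how MAGMA or FUMA compute things internally.

```python
import math
import numpy as np

def ols_one_sided(z, predictors):
    """Fit z ~ intercept + predictors by ordinary least squares.

    Returns (beta, t, p) for the FIRST predictor only, where p is a
    one-sided p-value for beta > 0 (normal approximation, which is
    reasonable with thousands of genes).
    """
    X = np.column_stack([np.ones(len(z))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof              # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance of the estimates
    t = beta[1] / math.sqrt(cov[1, 1])
    p = 0.5 * math.erfc(t / math.sqrt(2))     # P(N(0,1) > t)
    return beta[1], t, p

# Simulated data: one row per gene (all values are made up).
rng = np.random.default_rng(0)
n_genes = 18_000
avg_expr = rng.normal(size=n_genes)                  # A: average expression
tissue_expr = avg_expr + rng.normal(size=n_genes)    # E_t: expression in one tissue
z = 0.05 * tissue_expr + rng.normal(size=n_genes)    # Z: gene-level GWAS score

beta_E, t_E, p_E = ols_one_sided(z, [tissue_expr, avg_expr])
print(f"beta_E={beta_E:.3f}, t={t_E:.2f}, one-sided p={p_E:.2g}")
```

The point is just that every gene contributes one row, with a continuous Z on the left and continuous expression values on the right, and the test asks whether the slope on E_t is above zero.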
For the gene-set analysis (the ubiquitin, synapse gene sets, etc), there's a binary variable on the right side instead - a gene is either in the gene set or not. The z-score on the left is still continuous.
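A small sketch of the binary case, again in Python on invented data (the 200-gene set, the 0.3 shift and the variable names are assumptions, and no confounders are included). With only an intercept and a 0/1 membership column, the fitted slope is simply the difference between the mean Z of genes in the set and the mean Z of genes outside it.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes = 18_000

# Hypothetical gene set: 200 genes flagged 1, every other gene 0.
in_set = np.zeros(n_genes)
in_set[rng.choice(n_genes, size=200, replace=False)] = 1.0

# Continuous gene-level Z-scores, with a made-up upward shift for set members.
z = rng.normal(size=n_genes) + 0.3 * in_set

# Regress Z on an intercept plus the 0/1 membership indicator.
X = np.column_stack([np.ones(n_genes), in_set])
beta, *_ = np.linalg.lstsq(X, z, rcond=None)
beta0, beta_set = beta

mean_diff = z[in_set == 1].mean() - z[in_set == 0].mean()
print(f"slope on membership: {beta_set:.4f}")
print(f"mean(Z in set) - mean(Z not in set): {mean_diff:.4f}")
```

So the gene-set version is asking whether genes in the set have higher scores on average than the rest of the genes, with all genes still in the model.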