Using MAGMA on ME/CFS genetic data

I'm hoping to make some expression files to do this, based on either the 31 superclusters or the 461 clusters, but not separated by dissection like FUMA does. I'm trying to base these files on FUMA's scripts, using the raw Siletti et al. 2023 data from CellxGene.
Perhaps the file provided with the Duncan et al. 2025 paper might be useful? It gives a specificity score matrix for the Siletti et al. 2023 atlas.
EDIT: So it's each cluster's gene expression relative to the total gene expression of the dataset. It doesn't work with a covariate.
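
For illustration, here is a minimal sketch of what a specificity score of that kind looks like. It assumes specificity means a gene's mean expression in one cluster divided by its summed mean expression across all clusters; the exact definition in the Duncan et al. file may differ. The function name and the toy values are hypothetical.

```python
def specificity(mean_expr):
    """Scale each gene's per-cluster means so they sum to 1 across clusters.

    mean_expr: {gene: {cluster: mean expression}} (assumed input layout).
    Genes with zero total expression are dropped, since the ratio is undefined.
    """
    out = {}
    for gene, by_cluster in mean_expr.items():
        total = sum(by_cluster.values())
        if total <= 0:
            continue  # not expressed in any cluster
        out[gene] = {cluster: value / total for cluster, value in by_cluster.items()}
    return out


# Toy data, not real atlas values.
toy = {
    "GENE1": {"cluster_a": 8.0, "cluster_b": 2.0},
    "GENE2": {"cluster_a": 1.0, "cluster_b": 1.0},
    "GENE3": {"cluster_a": 0.0, "cluster_b": 0.0},
}
spec = specificity(toy)
for gene, row in spec.items():
    print(gene, row)
```

Because every row sums to 1, the absolute expression level of the gene is no longer recoverable from the matrix alone.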
 
I was considering that, and I would guess it'd work fine that way too. It's basically what tralfamadorian did to identify independent signals. I like the idea of starting from real expression values, though, since they aren't affected by the composition of the rest of the brain.

Also, I think it might be good to use the 31 superclusters instead of the 461 clusters, to reduce the multiple-testing burden. Maybe just averaging the Duncan clusters into superclusters would work.
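
A rough sketch of that averaging step, assuming a cluster-to-supercluster lookup is available from the atlas metadata. The mapping, names, and values below are made up for illustration.

```python
from collections import defaultdict
from statistics import fmean


def average_into_superclusters(spec, cluster_to_super):
    """Average each gene's cluster-level scores within each supercluster.

    spec: {gene: {cluster: score}}
    cluster_to_super: {cluster: supercluster} (hypothetical lookup table)
    """
    out = {}
    for gene, by_cluster in spec.items():
        grouped = defaultdict(list)
        for cluster, value in by_cluster.items():
            grouped[cluster_to_super[cluster]].append(value)
        out[gene] = {sc: fmean(values) for sc, values in grouped.items()}
    return out


# Toy data, not real atlas values.
cluster_scores = {"GENE1": {"c1": 0.6, "c2": 0.2, "c3": 0.2}}
mapping = {"c1": "SuperA", "c2": "SuperA", "c3": "SuperB"}
super_scores = average_into_superclusters(cluster_scores, mapping)
print(super_scores)
```

This is an unweighted mean, so a small cluster counts as much as a large one. Weighting by the number of cells per cluster would be an alternative if those counts are available.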

I'm not sure what you mean by working with a covariate. I think you could still condition on various things with the specificity matrix.

Edit: I haven't thought about it very deeply, but maybe the specificity matrix gets you to more or less the same place as raw expression conditioned on average brain expression, which is what I was planning to do.
 
I'm not sure what you mean by working with a covariate
By covariate I meant a data column with the average gene expression in the dataset, which you can choose to condition on or not. From what I remember, the Duncan et al. file only gives each cluster's gene expression relative to the dataset, not the absolute gene expression for each cluster and for the total dataset separately.
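
To make the idea concrete, here is a small sketch of a gene-by-cluster table with such an extra column. It assumes the column is the plain mean across clusters and is named Average, which as far as I understand matches the column FUMA's documented MAGMA call conditions on (`condition-hide=Average`). The function name and values are hypothetical.

```python
from statistics import fmean


def covariate_rows(mean_expr, clusters):
    """Build rows for a gene covariate table: GENE, one column per cluster, Average.

    mean_expr: {gene: {cluster: mean expression}} (assumed input layout).
    Average is the unweighted mean across the listed clusters (an assumption).
    """
    rows = [["GENE", *clusters, "Average"]]
    for gene in sorted(mean_expr):
        values = [mean_expr[gene][c] for c in clusters]
        rows.append([gene, *(f"{v:.4f}" for v in values), f"{fmean(values):.4f}"])
    return rows


# Toy data, not real atlas values.
toy_expr = {"GENE1": {"c1": 2.0, "c2": 4.0}, "GENE2": {"c1": 1.0, "c2": 1.0}}
rows = covariate_rows(toy_expr, ["c1", "c2"])
for row in rows:
    print(" ".join(row))
```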
 
Got it. Yeah, I don't think it's as necessary to condition on the whole dataset, since the values already show how much a gene is expressed in that cell type above and beyond the dataset in general. So we can already be somewhat confident that an association with a cell type using this data isn't confounded by average expression in the dataset.

I think in theory, you could still just download something like GTEx bulk brain expression and attach it as another column to condition on.
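
Attaching such a column could look roughly like this. The sketch assumes both tables use the same gene identifiers (for example, both Ensembl IDs) and simply drops genes missing from either one. The column name BulkBrain and all values are placeholders, not real GTEx numbers.

```python
def attach_covariate(spec, bulk, name="BulkBrain"):
    """Add an external per-gene value as an extra column of the score table.

    spec: {gene: {cluster: score}}
    bulk: {gene: bulk expression value} (hypothetical external table)
    Only genes present in both tables are kept.
    """
    shared = spec.keys() & bulk.keys()
    return {gene: {**spec[gene], name: bulk[gene]} for gene in sorted(shared)}


# Toy data, not real values.
scores = {"GENE1": {"c1": 0.8, "c2": 0.2}, "GENE2": {"c1": 0.5, "c2": 0.5}}
bulk_expr = {"GENE1": 12.3, "GENE9": 4.0}
combined = attach_covariate(scores, bulk_expr)
print(combined)
```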

Anyway, for the interaction analysis I was talking about, I think it would probably work to follow the steps on the website you linked, which I think is what tralfamadorian did to identify 3 independent cell types, and then test interactions of each of those three cell types with a lot of gene sets to see if any significant interactions emerge. Maybe it's not worth doing all the pre-processing steps I was planning.
 