Running FLAMES on DecodeME data

I've used the 1000G Phase3 EUR LD reference panel, will try to use the UK biobank LD panel later.
I got slightly different results using the UK biobank LD panel (UKBrelease2b10kEuropean). I specified the full rather than effective sample size (n = 275488).

FLAMES highlighted 31 instead of 32 causal genes (and VRK2 appears twice again, so it's 30 genes). The main differences are:

No longer in the FLAMES list with the UK Biobank LD panel:
RPP40
PLCL1
NEURL1
NEK1

New in the FLAMES list with the UK Biobank LD panel:
MLLT10
HTT
SSR1

It probably makes more sense to focus on the UK Biobank LD panel, so I would focus on this list. I'll paste it below again:

ARFGEF2
CA10
UNC13C
SHISA6
SOX6
MMS22L
OLFM4
PEBP1
ZNF644
LRRC7
DCC
MLLT10
HTT
CACNA1E
VRK2
ALK
VRK2
MICALL2
KIAA1239
STT3B
VPS54
RIMS1
PTPRE
NR2F1
PTBP2
RP11-147C23.1
ADARB2
SMCHD1
SSR1
LAMA2
HABP2
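
As a quick sanity check on the counts mentioned above, here is a minimal Python sketch. The variable names are my own, and the 1000G list is only reconstructed from the stated differences rather than taken from the actual FLAMES output, so treat it as illustrative.

```python
# Gene list as pasted above (FLAMES with the UK Biobank LD panel).
ukb_flames = [
    "ARFGEF2", "CA10", "UNC13C", "SHISA6", "SOX6", "MMS22L", "OLFM4",
    "PEBP1", "ZNF644", "LRRC7", "DCC", "MLLT10", "HTT", "CACNA1E",
    "VRK2", "ALK", "VRK2", "MICALL2", "KIAA1239", "STT3B", "VPS54",
    "RIMS1", "PTPRE", "NR2F1", "PTBP2", "RP11-147C23.1", "ADARB2",
    "SMCHD1", "SSR1", "LAMA2", "HABP2",
]

# Differences relative to the 1000G run, as listed in the post.
dropped_with_ukb = {"RPP40", "PLCL1", "NEURL1", "NEK1"}
gained_with_ukb = {"MLLT10", "HTT", "SSR1"}

unique_ukb = set(ukb_flames)
duplicates = sorted({g for g in ukb_flames if ukb_flames.count(g) > 1})

# Assumption: the 1000G list equals the UKB list minus the gained genes
# plus the dropped genes (reconstructed, not read from FLAMES output).
unique_1000g = (unique_ukb - gained_with_ukb) | dropped_with_ukb

print(len(ukb_flames), len(unique_ukb), duplicates, len(unique_1000g))
```

The reconstructed 1000G set has one more unique gene than the UK Biobank set, which is consistent with the 32 versus 31 entries mentioned above once the repeated VRK2 is counted once.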
 

Nice, thanks.

When doing the enrichment again with those, only one gene set is highlighted: synapse.

Another gene set, which I guess is targets of the microRNA mir-3620, was also significant, but was not highlighted, probably due to similarity to the synapse gene set.

term_id | term_name | highlighted | adjusted_p_value | term_size | query_size | intersection_size | effective_domain_size | intersections
GO:0045202 | synapse | TRUE | 0.0159 | 1608 | 27 | 9 | 22155 | ARFGEF2,UNC13C,SHISA6,DCC,HTT,CACNA1E,VPS54,RIMS1,LAMA2
MIRNA:hsa-mir-3620 | hsa-mir-3620 | FALSE | 0.0240 | 840 | 27 | 8 | 16638 | ARFGEF2,SHISA6,PEBP1,MLLT10,CACNA1E,PTBP2,SMCHD1,SSR1

It seems the other synapse-related gene sets are not significant anymore mainly because they include the now removed PLCL1 and/or NEURL1.

Edit: I might have been wrong about the reason mir-3620 isn't highlighted. I think the "highlighting" is just for GO terms. It might be worth looking at this miRNA.
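
To give a feel for where an enrichment p-value like this comes from, here is a rough sketch of a one-sided hypergeometric test for the synapse row. It assumes my reading of that row (term size 1608, query size 27, intersection 9, effective domain size 22155). It gives the unadjusted p-value only; the adjusted value in the table comes from the tool's own multiple-testing correction, which is not reproduced here.

```python
from math import comb


def hypergeom_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when drawing n genes from N, of which K are in the term."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Values assumed from the synapse row of the table (my parse of the numbers).
term_size = 1608
query_size = 27
intersection_size = 9
domain_size = 22155

# Overlap expected by chance, and the unadjusted enrichment p-value.
expected_overlap = query_size * term_size / domain_size
synapse_p = hypergeom_tail(intersection_size, query_size, term_size, domain_size)

print(f"expected overlap by chance: {expected_overlap:.2f}")
print(f"unadjusted hypergeometric p: {synapse_p:.2e}")
```

Roughly two synapse genes would be expected by chance in a query of this size, against nine observed, which is why the term survives correction even with only 27 genes in the query.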
 
@forestglip, I have a question about this data.

When I click on GO:0045202, the link takes me to another website, and when I click on the link there to all direct and indirect annotations to synapse (excluding "regulates"), I get this:


This page says that the synapse data is from domestic cattle and domestic cats. Do we have any idea of the relevance of this information to human synapses?
 
Good question. I don't know a lot about the gene ontology database. [Edit: For all I know, the annotations for all the different species are just the different forms of the same genes, just for a different animal. But I don't know how true that is.]

In any case, it looks like all of the DecodeME genes that showed up in the synapse gene set, and thus made it significant, are annotated under "Homo sapiens" so probably relevant to humans.

The gene set is also available on MSigDB, where the source species is given as "Homo sapiens". If you expand the "Show Members" link near the bottom, all of the following DecodeME synapse genes are there:
ARFGEF2,UNC13C,SHISA6,DCC,HTT,CACNA1E,VPS54,RIMS1,LAMA2
 
Thanks, quite a lot of differences unfortunately. Did it only focus on the hits above 5*10^-8?
Yes, exactly. I am not familiar with FLAMES. I used a classic approach (SusieR selects the true signal by posterior inclusion probability, PIP), no machine learning. Also, I used eQTLs from GTEx V10 to map true causal variants to genes. I wonder if this affected the results.
 
RP11-147C23.1 at location 1:97037083
For 1:97037083 it seems that FLAMES prioritised PTBP2, with RP11-147C23.1 being a lower-scoring alternative that doesn't pass the prioritisation threshold (0.051, i.e. ~5.1% of the scaled, locus-normalised FLAMES score). But there is a second RP11-147C23.1 entry, for the variant 1:96274668 / rs4615895 (FLAMES_scaled = 0.807 and estimated cumulative precision = 0.94). Querying the OpenGWAS/PheWAS API, it looks like the top associations for the latter are adiposity-related.
 
I specified the full rather than effective sample size (n = 275488).
I contacted the authors and they clarified that FLAMES requires the effective sample size ("the sample size is used for fine-mapping and should ideally be Neff"). So I'll try to do it again with the UK biobank controls and effective sample size.
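
For reference, a small sketch of one common convention for the effective sample size of a case-control GWAS, Neff = 4 / (1/N_cases + 1/N_controls). Whether this is exactly the definition the FLAMES authors have in mind is an assumption on my part. The case/control split below is also an assumption: it is the split I believe DecodeME reported (15,579 cases and 259,909 controls, which sum to the 275,488 quoted above) and should be checked against the summary statistics.

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Common case-control convention: Neff = 4 / (1/n_cases + 1/n_controls)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


# Assumed split; verify against the DecodeME summary statistics before use.
n_cases = 15_579
n_controls = 259_909

n_total = n_cases + n_controls
n_eff = effective_sample_size(n_cases, n_controls)

print(f"total N = {n_total}, effective N = {n_eff:.0f}")
```

With so many more controls than cases, the effective sample size is far smaller than the total, so passing the full n would overstate the power assumed during fine-mapping.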
 
This 2020 paper on the H-MAGMA method also includes a list of 1,841 pleiotropic genes that were found in 4 or more out of 5 tested neuropsychiatric conditions such as ADHD, autism, schizophrenia, bipolar disorder and depression.

I checked the FLAMES list from above to see if any of its genes appear in it and got the following results.
[Attached image: 1780771931861.webp]
I'm a bit surprised to see ARFGEF2 and OLFM4 in it; I thought these had a more specific function, but apparently not. It also makes it more likely that the OLFM4 hit has a neurological rather than immunological meaning.
 
STT3B : Could be important because it points to N-Linked glycosylation. Note below asparagine, dolichol and glycosylation defects

STT3B encodes the catalytic subunit of the oligosaccharyltransferase complex that transfers oligosaccharides onto asparagine residues during N-glycosylation. It acts on Asn-X-Ser/Thr motifs and functions during both co- and post-translational phases at the endoplasmic reticulum translocon. By enabling glycosylation of skipped or otherwise inaccessible sites, STT3B supports protein maturation in the endoplasmic reticulum membrane.

At the mechanistic level, STT3B uses dolichol-linked oligosaccharides as donors to modify nascent proteins and help complete post-translational glycosylation. This activity is linked to the handling of misfolded proteins, including the AMYL-TTR "Asp-38" variant, which is targeted for degradation through the ERAD pathway. Protein expression is high in the proximal digestive tract, hepatobiliary system, and pancreas, with additional measurements in the female reproductive system, endocrine system, connective tissue, and male reproductive system.

Pathogenic variants in STT3B are reported as a cause of congenital disorder of glycosylation type Ix. Related disorders listed include congenital disorder of deglycosylation 1, congenital disorder of glycosylation type IIc, and congenital disorder of glycosylation type In.

Some posts on Asparagine, Dolichol, Glycosylation can be found in this thread : https://s4me.info/threads/n-linked-glycosylation.41469

Dolichol can be also found here : https://s4me.info/threads/genetics-chromosome-6-btn2a2-and-btn3a3-btn2a1.45503/page-9#post-690615
 