Genetics: Chromosome 1 RABGAP1L

It may interact with RAB3GAP and RAB1A; I wonder whether it does a similar thing. The paper mentions rapamycin.


Rab3gap1 palmitoylation cycling modulates cardiomyocyte exocytosis and atrial natriuretic peptide release

 
I initially thought that the causal gene(s) for this region on chromosome 1 are too uncertain given how many protein-coding genes are packed in this region. I still think this is the case.

But zooming out, it seems that there might be two independent signals close to each other. A small group of SNPs that is near the significance threshold, but then there's also a longer structure around 10^-6 that seems to correspond to the length of RABGAP1L.

1758812399402.png

The ones in the RABGAP1L region are no longer in strong LD with the top hit. So perhaps this 'dragon-like figure' of SNPs with p-values around 10^-6 points to the RABGAP1L gene, even if the SNPs with the lowest p-values close to it point to another gene?

1758812521063.png
 
A small group of SNPs that is near the significance threshold, but then there's also a longer structure around 10^-6 that seems to correspond to the length of RABGAP1L.
Hmm, I'd be cautious about ascribing similarities in length between a gene and a signal as more than a coincidence.

Here's that area except with LD colors in reference to the top SNP (purple diamond) of that little length of SNPs:
lz.png

It looks like that's just a stretch of SNPs that are in strong LD with each other, so if one is significant, the rest are as well.

Edit: Corrected chart to use European LD.
 
It looks like that's just a stretch of SNPs that are in strong LD with each other, so if one is significant, the rest are as well.
Yes but perhaps they are in strong LD because they are related to the long RABGAP1L gene?

It also looks different from the SNPs that hit the significance threshold close by and which seem more ambiguous (not sure which gene they point to). But the dragonlike-SNPs at 10^-6 probably point to RABGAP1L or GPR52.
 
Yes but perhaps they are in strong LD because they are related to the long RABGAP1L gene?
I think LD generally reflects where DNA tends to be cut and rejoined when chromosomes are shuffled (recombination) during meiosis, the cell division that produces eggs and sperm. I'm not sure if there's any relation between regions that tend to stay together and the locations of specific genes, but maybe it happens. I don't know enough to say it's not possible.

Even if that was the case, a significant SNP in that region would make the whole long region significant whether it's causally related to RABGAP1L or any other gene.
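
To make the "in strong LD with each other" idea concrete, here is a minimal sketch of how r² between two SNPs can be computed from phased haplotypes. The haplotypes are made-up toy data (not from DecodeME or any reference panel), and `r_squared` is only an illustrative helper name.

```python
def r_squared(hap_a, hap_b):
    """LD r^2 between two biallelic sites, given phased 0/1 alleles per haplotype."""
    n = len(hap_a)
    p_a = sum(hap_a) / n                                   # allele frequency at site A
    p_b = sum(hap_b) / n                                   # allele frequency at site B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n    # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                   # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Toy haplotypes (hypothetical): snp2 always travels with snp1, snp3 only sometimes.
snp1 = [1, 1, 1, 0, 0, 0, 0, 0]
snp2 = [1, 1, 1, 0, 0, 0, 0, 0]
snp3 = [1, 0, 1, 0, 1, 0, 0, 0]

print("r2(snp1, snp2) =", r_squared(snp1, snp2))
print("r2(snp1, snp3) =", round(r_squared(snp1, snp3), 3))
```

When r² is close to 1, as for the first pair, an association at one SNP shows up at the other with nearly the same strength, which is why a whole LD block tends to rise and fall together in a regional plot.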

It also looks different from the SNPs that hit the significance threshold close by and which seem more ambiguous (not sure which gene they point to). But the dragonlike-SNPs at 10^-6 probably point to RABGAP1L or GPR52.
It's possible. I wouldn't discount that it might just be LD with the same signal as the main locus though.
 
It also looks different from the SNPs that hit the significance threshold close by and which seem more ambiguous (not sure which gene they point to).
The most significant hit there (1:173,846,152 / rs12071663?) seems to sit pretty comfortably in an intronic region of DARS2, looking at the UCSC Genome Browser.

I'm not sure if there's any relation between regions that tend to stay together and the locations of specific genes, but maybe it happens. I don't know enough to say it's not possible.
That happened to be a topic for one of the lectures I attended earlier this year--certain regions of the genome can be more or less prone to crossover events due to having a sequence that facilitates things like recruitment of certain DNA excision & repair proteins, or formation of 3D structures in the DNA itself that encourage/discourage crossover. It's definitely possible that certain genes/regions of the genome evolved to have sequences that discourage crossover within that region--which could be because it was advantageous to keep those regions together transcriptionally, or because that sequence happens to create a protein with a certain advantageous structure.

Yes but perhaps they are in strong LD because they are related to the long RABGAP1L gene?
So I think you'd be right to suspect that something about the RABGAP1L gene causes the "dragon" shape. Molecularly, there are probably two "stable" regions there (denoted by the two different LD colors in the dragon) with a small stretch more prone to crossover right between them.

Even if that was the case, a significant SNP in that region would make the whole long region significant whether it's causally related to RABGAP1L or any other gene.
And you'd be right to conclude that as well. We just know that at least one thing in that general region is associated with ME/CFS--best candidate would be the most significant hit in DARS2, but it could be any of the stretches all in LD with each other. Looking at the supplementary tables, it seems like the final gene annotation chosen for DecodeME's top SNPs was heavily influenced by number of tissues in the coloc data.
 
Thanks for the interesting discussion. What do the arrows beside gene names signify in the charts? Some go right and some go left.
The arrows indicate which strand of the DNA the gene is actually on, since both strands of DNA encode genes. Sometimes genes can even overlap in the same position on opposite strands, so a given sequence (e.g. ATTACGCA) belongs to gene 1 but the complementary nucleotides for that sequence belong to gene 2 (i.e. TAATGCGT--actually it would be the reverse complement TGCGTAAT because transcription happens in the opposite direction on the opposite strand)

[Edit: More specifically, the arrows indicate which way RNA polymerase travels when the gene is being transcribed, which differentiates the strands]
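
As a quick illustration of the complement vs. reverse complement point, here is a small sketch using the example sequence from this post. The function names are just illustrative.

```python
# Translation table mapping each DNA base to its complementary base.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Base-by-base complement, read in the same left-to-right order."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    """Sequence of the opposite strand, read in that strand's own 5'->3' direction."""
    return complement(seq)[::-1]

seq = "ATTACGCA"
print(complement(seq))          # TAATGCGT
print(reverse_complement(seq))  # TGCGTAAT
```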
 
But the dragonlike-SNPs at 10^-6 probably point to RABGAP1L or GPR52.

Linking to the thread for a fibromyalgia GWAS since there was a signal in the same region and they believe it may be related to GPR52:
The strongest association was with a coding variant in HTT, the causal gene for Huntington's disease. Gene prioritization implicated the HTT regulator GPR52,

The two relevant loci in DecodeME, showing that there may be a signal in the HTT region in ME/CFS as well (black triangles are locations of top hits from fibromyalgia):

1.png
13.png
 
I used the Open GWAS API to retrieve traits associated with the lead variants at each of the 8 DecodeME loci. I wrote more details on the CA10 locus thread.

Here are traits associated with rs12071663/1:173846152:T:C (the lead DecodeME variant near RABGAP1L), with p<1e-6, starting from most significant:
For all rows:
chr = 1
position (GRCh37) = 173815290
ea (effect allele) = C
nea (non-effect allele) = T

| id | trait | eaf (effect allele frequency) | beta | se | p | n |
| --- | --- | --- | --- | --- | --- | --- |
| eqtl-a-ENSG00000117601 | ENSG00000117601 (SERPINC1) | 0.335807 | 0.163314 | 0.0125272 | 7.5753e-39 | 26483 |
| eqtl-a-ENSG00000152061 | ENSG00000152061 (RABGAP1L) | 0.335807 | 0.110485 | 0.012568 | 1.48491e-18 | 4927 |
| eqtl-a-ENSG00000117593 | ENSG00000117593 (DARS2) | 0.335807 | -0.100146 | 0.0125742 | 1.65768e-15 | 27308 |
| ebi-a-GCST90014008 | IGF 1 (UKB data field 30770) | nan | 0.0155756 | 0.00219507 | 1.28647e-12 | 387834 |
| ieu-b-5135 | Age at menarche | 0.404 | 0.0151 | 0.0026 | 3.68799e-09 | |
| ukb-d-30770_irnt | IGF-1 | 0.32381 | 0.013761 | 0.0024847 | 3.05499e-08 | 342439 |
| ukb-d-30770_raw | IGF-1 | 0.32381 | 0.077968 | 0.014092 | 3.15631e-08 | 342439 |
| ebi-a-GCST90002363 | Red blood cell count | 0.328225 | -0.011144 | 0.002021 | 3.57998e-08 | 545203 |
| ieu-b-5136 | Age at menarche | 0.324 | 0.0168 | 0.0031 | 3.80601e-08 | |
| eqtl-a-ENSG00000203739 | ENSG00000203739 (AL645568.1) | 0.335807 | -0.0690706 | 0.012589 | 4.09742e-08 | 1241 |
| ebi-a-GCST90002403 | Red blood cell count | 0.325706 | -0.0117421 | 0.00222936 | 1.40001e-07 | 408112 |
| ebi-a-GCST90002351 | Neutrophil count | 0.327816 | -0.010843 | 0.00209 | 2.15998e-07 | 519288 |
| ukb-d-30030_irnt | Haematocrit percentage | 0.323688 | -0.0106562 | 0.00207978 | 2.99744e-07 | 350475 |
| ebi-a-GCST90002398 | Neutrophil count | 0.325705 | -0.0116293 | 0.00228513 | 3.59998e-07 | 408112 |
| ebi-a-GCST90002384 | Hemoglobin | 0.325713 | -0.0115341 | 0.002268 | 3.69999e-07 | 408112 |
| ebi-a-GCST90002310 | Hemoglobin concentration | 0.328232 | -0.010386 | 0.002057 | 4.54004e-07 | 563946 |
| ebi-a-GCST90016621 | Schizophrenia vs bipolar disorder (ordinary least squares (OLS)) | nan | -0.012 | 0.0024 | 4.79999e-07 | 61027 |
| ebi-a-GCST90002383 | Hematocrit | 0.325719 | -0.0113739 | 0.00226979 | 5.39995e-07 | 408112 |
| ebi-a-GCST90018960 | Hematocrit | 0.377649 | -0.0087 | 0.0017 | 5.41701e-07 | 350475 |
The trait most significantly associated with this variant is expression of the SERPINC1 gene.

The C allele at this variant is associated with decreased risk of ME/CFS (odds ratio less than 1, i.e. beta less than 0, in DecodeME) and increased expression of SERPINC1 (beta greater than 0 above).
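
For anyone wanting to sanity-check how the beta, se and p columns relate, here is a rough sketch. It assumes the p-values come from a two-sided Wald test (z = beta / se under a normal approximation), which is my assumption and not something stated by Open GWAS here. The beta and se values are copied from the three eQTL rows above, and `two_sided_p` is just an illustrative helper name.

```python
import math

def two_sided_p(beta, se):
    """Two-sided p-value for z = beta / se under a standard normal approximation."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2))

# (beta, se) pairs copied from the eQTL rows of the table.
rows = {
    "SERPINC1": (0.163314, 0.0125272),
    "RABGAP1L": (0.110485, 0.012568),
    "DARS2": (-0.100146, 0.0125742),
}

for gene, (beta, se) in rows.items():
    print(gene, f"z={beta / se:.2f}", f"p={two_sided_p(beta, se):.2e}")
```

Under that assumption the recomputed p-values land very close to the p column for these rows, and the sign of z shows the direction of effect for the C allele.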
 
So, just to make sure I'm following:
  • A 'T' at this rs12071663 location is associated with increased risk of ME/CFS.
  • Previously, we had that a 'T' there was associated with decreased RABGAP1L expression -- was that info from GTEx? Did we know what tissue that was supposed to be in?
  • Now, by comparing GWAS summary stats, you're seeing that the 'T' variant is associated with lower expression of SERPINC1?
Let me know if there's a good way of figuring out what study ieu open gwas is pulling that info from. From googling the author it looks like the study that found the SERPINC1 connection was probably a study of blood samples?

The human protein atlas is saying that SERPINC1 is expressed in the liver and not at all in the brain. They also say this:
The protein encoded by this gene, antithrombin III, is a plasma protease inhibitor and a member of the serpin superfamily. This protein inhibits thrombin as well as other activated serine proteases of the coagulation system, and it regulates the blood coagulation cascade. The protein includes two functional domains: the heparin binding-domain at the N-terminus of the mature protein, and the reactive site domain at the C-terminus. The inhibitory activity is enhanced by the presence of heparin. Numerous mutations have been identified for this gene, many of which are known to cause antithrombin-III deficiency which constitutes a strong risk factor for thrombosis. A reduction in the serum level of this protein is associated with severe cases of Coronavirus Disease 19 (COVID-19).
 
A 'T' at this rs12071663 location is associated with increased risk of ME/CFS.
Yes. In Table 3 of the DecodeME paper, the variant is 1:173846152:T:C. Since C is the letter at the end, that's what the odds ratio in the table is describing. The odds ratio is less than 1 (0.927), so having a C is associated with decreased risk. Thus, having a T is associated with increased risk.
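
The same flip can be written out numerically. This is only a sketch of the arithmetic: the 0.927 is the odds ratio quoted above, and the variable names are illustrative.

```python
import math

or_c = 0.927              # odds ratio for the C allele (Table 3 of the DecodeME paper)
beta_c = math.log(or_c)   # log odds ratio; negative because or_c < 1

# Switching the effect allele from C to T inverts the odds ratio,
# which is the same as flipping the sign of beta.
or_t = 1 / or_c
beta_t = -beta_c

print(f"C allele: OR={or_c:.3f}, beta={beta_c:.4f}")
print(f"T allele: OR={or_t:.3f}, beta={beta_t:.4f}")
```

So an odds ratio of 0.927 for C corresponds to roughly 1.08 for T. The same sign flip applies to the expression betas in the table when reading them for T instead of C.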

Previously, we had that a 'T' there was associated with decreased RABGAP1L expression -- was that info from GTEx?
Yes, the DecodeME study used GTEx for that. The table above also has RABGAP1L association with this variant, and the direction is the same.

Now, by comparing GWAS summary stats, you're seeing that the 'T' variant is associated with lower expression of SERPINC1?
Yep.

Let me know if there's a good way of figuring out what study ieu open gwas is pulling that info from. From googling the author it looks like the study that found the SERPINC1 connection was probably a study of blood samples?
The eQTL traits in Open GWAS appear to be sourced from the eQTLGen consortium (linked at the top of the main datasets page after clicking the "show batches" button).
The aim of the eQTLGen consortium is to investigate the genetic architecture of blood gene expression and to understand the genetic underpinnings of complex traits.
The website links to this study: Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression (2021, Võsa et al, Nature Genetics)
 