Running FLAMES on DecodeME data

Alright, I got PoPS to run, but I'm unsure whether some of the parameters I used are correct.

The setup:
  1. Cloned https://github.com/FinucaneLab/pops, then set up a virtual env using Python 3.8 (it needs this for some of its dependencies) and installed requirements.txt in this env
  2. I then created a data folder in this dir: /pops/data
  3. I downloaded "pops_features_full_FUMA_compatible.tar.gz" and unzipped it into my pops/data dir
  4. I renamed the folder to make Martin's instructions easier to follow: "pops_features_full_FUMA_compatible" -> "pops_features_full"
  5. I then unzipped @forestglip's full MAGMA results into my pops/data dir
  6. I then ran PoPS using the following parameters, as per the instructions here:
    1. python pops.py --gene_annot_path {USER DIR}/pops/data/pops_features_full/gene_annots.txt --feature_mat_prefix {USER DIR}/pops/data/pops_features_full/features_munged/pops_features --num_feature_chunks 116 --magma_prefix {USER DIR}\Documents\pops\data\magma --control_features {USER DIR}/pops/data/pops_features_full/control.features --out_prefix test
This created three files: test.coefs, test.marginals, test.preds
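Since the parameters are the uncertain part, one cheap check is whether every file the command points at actually exists. The sketch below assumes PoPS reads `<magma_prefix>.genes.out` and `<magma_prefix>.genes.raw`, plus `<feature_mat_prefix>.rows.txt` and one `.mat.<i>.npy` / `.cols.<i>.txt` pair per chunk. That naming is my reading of the PoPS repo, not something confirmed in this thread, and `check_pops_inputs` is a made-up helper name. It may be worth running because the --magma_prefix path in the command above is written differently from the other paths.

```python
from pathlib import Path


def check_pops_inputs(feature_mat_prefix, magma_prefix, num_feature_chunks):
    """Return the input files PoPS is assumed to need that are missing.

    The file naming here is an assumption based on the PoPS repo layout.
    """
    expected = [
        Path(f"{magma_prefix}.genes.out"),
        Path(f"{magma_prefix}.genes.raw"),
        Path(f"{feature_mat_prefix}.rows.txt"),
    ]
    # One matrix file and one column-name file per feature chunk,
    # numbered from 0 up to num_feature_chunks - 1.
    for i in range(num_feature_chunks):
        expected.append(Path(f"{feature_mat_prefix}.mat.{i}.npy"))
        expected.append(Path(f"{feature_mat_prefix}.cols.{i}.txt"))
    return [str(p) for p in expected if not p.is_file()]


# Example call, using the same values passed to pops.py:
# missing = check_pops_inputs(".../features_munged/pops_features", ".../data/magma", 116)
# print(missing)
```

An empty list means the prefixes and --num_feature_chunks line up with what is on disk. Anything listed is a path to double-check.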

I think this worked. Happy to upload the files if anyone can check them.
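For a quick look before uploading anything, the top-scoring genes can be pulled out of test.preds. This is a minimal sketch that assumes the .preds file is tab-delimited with `ENSGID` and `PoPS_Score` columns. Those column names are my assumption about the PoPS output, so check the header line first. `top_pops_genes` is a made-up helper name.

```python
import csv


def top_pops_genes(preds_path, n=10):
    """Return the n highest-scoring (gene ID, score) pairs from a .preds file.

    Assumes a tab-delimited file with ENSGID and PoPS_Score columns.
    """
    with open(preds_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        rows = [(row["ENSGID"], float(row["PoPS_Score"])) for row in reader]
    # Highest PoPS score first.
    return sorted(rows, key=lambda pair: pair[1], reverse=True)[:n]


# Example call:
# for gene, score in top_pops_genes("test.preds"):
#     print(gene, score)
```

If the top of that list looks like plausible genes for the trait, that is at least a sign the run did something sensible.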

Next, the hard part: formatting credible sets. I think I'll use FINEMAP (http://www.christianbenner.com/), but I need to spin up a Linux distro real quick first.
 
Holy cow, step 4 (fine-mapping) is no joke. I wouldn't even say this is programming; it's more like puzzle solving and plugging in the right inputs.
Files needed for a FINEMAP run:
  1. Master file
  2. Z file
  3. LD (linkage disequilibrium) file. This must be created with the LDstore software
    1. LDstore requires:
      1. Master file
      2. Z file
      3. BGEN file
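To make the file list above a bit more concrete, here is a rough sketch of writing the Z file and the FINEMAP master file for one locus. It assumes the layout in the FINEMAP documentation as I understand it: a space-delimited Z file with the columns `rsid chromosome position allele1 allele2 maf beta se`, and a semicolon-delimited master file with the columns `z;ld;snp;config;cred;log;n_samples`. The helper names and the `locus1` prefix are made up for illustration, and the LDstore master file has its own set of columns that is not covered here.

```python
Z_COLUMNS = ["rsid", "chromosome", "position", "allele1", "allele2", "maf", "beta", "se"]
MASTER_FILE_COLUMNS = ["z", "ld", "snp", "config", "cred", "log"]


def write_z_file(variants, path):
    """Write a list of variant dicts as a space-delimited Z file.

    Each dict needs one value per entry in Z_COLUMNS (assumed FINEMAP layout).
    """
    with open(path, "w") as handle:
        handle.write(" ".join(Z_COLUMNS) + "\n")
        for variant in variants:
            handle.write(" ".join(str(variant[c]) for c in Z_COLUMNS) + "\n")


def write_master_file(prefix, n_samples, path):
    """Write a one-locus master file whose file names all share one prefix.

    n_samples is the sample size of the GWAS the summary stats came from.
    """
    header = ";".join(MASTER_FILE_COLUMNS + ["n_samples"])
    row = ";".join([f"{prefix}.{c}" for c in MASTER_FILE_COLUMNS] + [str(n_samples)])
    with open(path, "w") as handle:
        handle.write(header + "\n" + row + "\n")


# Example call (hypothetical prefix, real GWAS sample size goes in n_samples):
# write_z_file(variants, "locus1.z")
# write_master_file("locus1", n_samples, "master")
```

The variant rows would come from the GWAS summary statistics for one locus. The row order matters because the LD file has to list the SNPs in the same order as the Z file.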
Starting from the bottom up
BGEN file:
I *think* you can get this from the UK Biobank here. Can anyone let me know if that's correct? Also, how do you even access this file? Do I need to sign up? Also, this might be able to use 1000 Genomes, but would that be way less accurate?

Edit: The UK Biobank file would be nearly 2 TB, so I think I would have to use the 1000 Genomes BGEN.
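Whichever reference panel ends up being used, the LD file at the end of the chain is, as far as I understand the FINEMAP docs, a space-delimited SNP-by-SNP matrix of Pearson correlations in the same SNP order as the Z file. LDstore computes it from the BGEN, so the sketch below is not a replacement for it. It only shows what ends up in the file, using plain per-sample allele dosages as input. The function names are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length dosage lists.

    Fails with a division by zero if either SNP has no variation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ld_matrix(dosages):
    """Build the SNP-by-SNP correlation matrix.

    dosages holds one list per SNP (same order as the Z file),
    with one dosage value per reference-panel sample.
    """
    return [[pearson(a, b) for b in dosages] for a in dosages]


def write_ld_file(matrix, path):
    """Write the matrix as space-delimited text, one row per SNP."""
    with open(path, "w") as handle:
        for row in matrix:
            handle.write(" ".join(f"{r:.6f}" for r in row) + "\n")
```

This also shows why the choice of panel matters: the correlations are estimated from whoever is in the reference sample, so a panel that matches the GWAS participants less closely gives a less accurate matrix.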
 
Also this might be able to use the 1000 genomes, but would be way less accurate?
UK Biobank would be ideal, since it more closely matches the participants. But 1000G still worked well enough to get roughly the same results when I used it in FUMA the first time, mostly just less significant.

I don't think @hotblack and I ended up finding a source for UKB LD files when trying to run MAGMA locally.
 