Interpretation of DNA raw data

Discussion in 'Laboratory and genetic testing, medical imaging' started by Inara, Jun 6, 2018.

  1. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    Hi all!

    I had a genome analysis done. My raw data was supplied as cvs and xlsx files. Does anybody know how to interpret these results? Unfortunately, the data can't be uploaded to providers like Promethease because of formatting problems (this can't be changed). Or does anybody have an idea how to proceed?
     
    merylg and alktipping like this.
  2. Hoopoe

    Hoopoe Senior Member (Voting Rights)

    Messages:
    5,425
    There are programs that can read the data and highlight interesting mutations. I don't know what to recommend because I have not used any such programs.

    You can convert from one format to another with a program. What exactly is the problem (if you know)?

    If you know the desired format, and the format the file is currently in, you can search with Google for "convert xyz to abc", where xyz and abc are the names of the formats.
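
    To give a rough idea of what such a conversion involves, here is a minimal Python sketch. It assumes the source is a comma-separated file with one variant per row and that the target is a tab-separated text layout with rsid, chromosome, position and genotype columns, which is the kind of layout some consumer genotyping services use. The column names in COLUMN_MAP and the sample rows are made up for illustration; the real files may be organised quite differently.

```python
import csv
import io

# Hypothetical mapping: target column -> column name in the source CSV.
# The real CSV may use different names or not contain these fields at all.
COLUMN_MAP = {
    "rsid": "rsid",
    "chromosome": "chrom",
    "position": "pos",
    "genotype": "genotype",
}
TARGET_COLUMNS = ("rsid", "chromosome", "position", "genotype")


def csv_to_tsv(src, dst, column_map=COLUMN_MAP):
    """Copy rows from a comma-separated source to a tab-separated target.

    `src` and `dst` are open text file objects. A '#' comment header naming
    the columns is written first. Returns the number of data rows written.
    """
    reader = csv.DictReader(src)
    dst.write("# " + "\t".join(TARGET_COLUMNS) + "\n")
    count = 0
    for row in reader:
        fields = [row[column_map[name]].strip() for name in TARGET_COLUMNS]
        dst.write("\t".join(fields) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Made-up sample rows, only to show the shape of the output.
    sample = io.StringIO("rsid,chrom,pos,genotype\nrs123,1,1000,AG\nrs456,2,2000,CC\n")
    out = io.StringIO()
    csv_to_tsv(sample, out)
    print(out.getvalue(), end="")
```

    Simply renaming a file from .csv to .txt changes nothing inside it, so a step like this (or a dedicated converter) is what "converting" would actually mean.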
     
    andypants, alktipping and Inara like this.
  3. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    That was the idea. But I have tried one online program, and the problem lies in the formatting of the files or their content. I haven't tried other providers/programs, and I think that, because of the file problem, they won't work either.

    Ok, so I should try converting the files. And then try another program.

    But what if this doesn't work? Has anybody analyzed the results themselves?
     
    andypants and alktipping like this.
  4. Hoopoe

    Hoopoe Senior Member (Voting Rights)

    Messages:
    5,425
    You need to know the format of the data that you have. This is not necessarily the same thing as the file type. Once you know the format, you can check whether the interpretation program supports it, and if not, either find a way to convert to the right format or find a different interpretation program.

    Presumably you mean .csv and .xlsx

    To figure out the format:

    1. Try opening the .csv in a text editor and see if in the first few lines there is any information on the format. If you're not sure you can copy the information here.

    2. Which company did the genome analysis?
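
    If a text editor struggles with a large file, step 1 can also be done with a short script. This is only a sketch: the file name in the usage comment is a placeholder, and the delimiter guess is a simple heuristic (it picks the candidate character that occurs the same non-zero number of times on every non-comment line), not a reliable format detector.

```python
from itertools import islice


def peek(path, n=10, encoding="utf-8"):
    """Return the first n lines of a text file without reading the rest."""
    with open(path, "r", encoding=encoding, errors="replace", newline="") as handle:
        return [line.rstrip("\r\n") for line in islice(handle, n)]


def guess_delimiter(lines, candidates=(",", ";", "\t")):
    """Guess the column delimiter from a few sample lines.

    Lines that are empty or start with '#' are ignored. Returns None when no
    candidate occurs a consistent, non-zero number of times per line.
    """
    data = [line for line in lines if line and not line.startswith("#")]
    best, best_count = None, 0
    for cand in candidates:
        counts = {line.count(cand) for line in data}
        if len(counts) == 1:
            count = next(iter(counts))
            if count > best_count:
                best, best_count = cand, count
    return best


# Usage (placeholder file name):
#   first_lines = peek("my_raw_data.csv")
#   print("\n".join(first_lines))
#   print("delimiter guess:", repr(guess_delimiter(first_lines)))
```

    Header or comment lines at the top often name the format or the tool that produced the file; if the data starts immediately, the column layout is the main clue.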

    PS: I'm going to bed now, we can talk more about this tomorrow if you still have problems.
     
    Last edited: Jun 6, 2018
  5. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    Thank you @strategist! I am pretty exhausted today, too, and I have to check the files (they are on another device). The company that did the analysis was Dante Labs.
     
    alktipping likes this.
  6. Hoopoe

    Hoopoe Senior Member (Voting Rights)

    Messages:
    5,425
    On their whole genome sequencing service, Dante Labs says

    gVCF and BAM are formats. This doesn't match what you said you were given.
     
    Inara likes this.
  7. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    Yes, that's what they say. There were problems with the gvcf file that couldn't be resolved. Then they provided me with xlsx and csv files, and these don't work either.

    Since I no longer think I will receive data in a reasonable, usable format, I wanted to try another option.

    I think I will have to communicate with them again.

    I still have to check the csv file...Will do it!
     
  8. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    Ok, I have checked the csv files. There is no info about format etc. The files can be opened and used as excel files.

    I did contact Promethease about this problem and they told me they can't help. The formatting of the files seems to be problematic.
     
  9. Amw66

    Amw66 Senior Member (Voting Rights)

    Messages:
    6,769
    My aunt's 23andMe file is a basic text file that opens in Notepad - this is what she used for Promethease.
    Can you export the csv as a text file?
     
    Inara likes this.
  10. Amw66

    Amw66 Senior Member (Voting Rights)

    Messages:
    6,769
    You can save the csv as a text file using "Save as" - perhaps this will be usable?
     
    Inara likes this.
  11. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    Thank you, @Amw66, I'll try that.
     
  12. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    I saved a csv as .txt, but when I upload it to Promethease I get an error message (unknown file format or so). As always.

    Nothing works: gvcf, xlsx, xls, csv or txt. :(
     
  13. Hoopoe

    Hoopoe Senior Member (Voting Rights)

    Messages:
    5,425
    Did you open the csv in a text editor and look at information in the beginning? What does it say? Or does the genetic data start immediately?

    If you want I can take a look. Maybe I can figure out what needs to be done.
     
    Inara likes this.
  14. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    When I opened the csv file in the text editor the genetic data started at once. There was no general information. Should I post some lines from the txt file (original csv file)?

    Thank you! :)
     
  15. Hoopoe

    Hoopoe Senior Member (Voting Rights)

    Messages:
    5,425
    Sure let's try that. First 10 lines please.

    PS: also what's in the xlsx? Is it the same data?
     
    Inara likes this.
  16. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    The xls files contain DNA data, too, but for genes that weren't analyzed in the first run. So I have an xls file with one gene, another one with one gene, one with HLA genes and one with a handful of genes. Then there are the csv files containing the rest. To be honest, I don't understand it completely. It seems the csv files are split, but I don't know what belongs together.

    Should I also post the folder structure of the csv files? There are several folders, each containing several csv files.

    P.S. It's nearly bedtime for me, so I will do all the stuff tomorrow.
     
  17. Hoopoe

    Hoopoe Senior Member (Voting Rights)

    Messages:
    5,425
    This is getting complicated. I'll need a copy of all files. If the file size is large or you don't want to share all your data, you can trim the files down to the first 10 lines only. The folder structure and file names and endings need to remain intact also (make a copy of the top level folder, optionally trim the files, compress it and then send it, preferably via PM).
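
    If doing this by hand is tedious, the copy-trim-compress steps could be scripted roughly as below. It is a sketch under a few assumptions: only files ending in .csv or .txt are copied (binary files such as .xlsx can't be cut by line this way and are skipped), the folder names in the usage comment are placeholders, and the output folder should sit outside the source folder.

```python
import shutil
from itertools import islice
from pathlib import Path


def make_trimmed_copy(src_root, dst_root, max_lines=10, suffixes=(".csv", ".txt")):
    """Mirror src_root into dst_root, keeping only the first lines of each file.

    Folder structure, file names and extensions are preserved. Returns the
    path of a zip archive created next to dst_root.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    dst_root.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_root.rglob("*")):
        # Skip folders and anything that is not a plain-text data file.
        if not src.is_file() or src.suffix.lower() not in suffixes:
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Binary mode copies the bytes unchanged (line endings, encoding).
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            fout.writelines(islice(fin, max_lines))
    return shutil.make_archive(str(dst_root), "zip", root_dir=str(dst_root))


# Usage (placeholder folder names):
#   archive = make_trimmed_copy("dante_raw", "dante_trimmed")
#   print("send this file:", archive)
```

    The original files are never modified; only the trimmed copy ends up in the archive.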
     
    Inara likes this.
  18. sea

    sea Senior Member (Voting Rights)

    Messages:
    476
    Location:
    NSW, Australia
    I hope you find a workaround. If not, Dante Labs really should redo your test if they are unable to provide your info in a readable format.
     
    andypants, Inara and Trish like this.
  19. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    @strategist, I have sent more details via PM.

    Thank you @sea for your wishes! They should provide usable data. But even though they are very nice and want to help, it doesn't work.
     
  20. Inara

    Inara Senior Member (Voting Rights)

    Messages:
    2,734
    What's also strange - several genes weren't analyzed although they say "whole genome sequencing". I don't see the pattern...I can't check it, of course, because there are too many genes. But in my eyes these were relevant genes.
     
    andypants likes this.
