Atlas of the plasma proteome in health and disease in 53,026 adults, 2024, Yue-Ting Deng et al

Discussion in ''Conditions related to ME/CFS' news and research' started by chillier, Nov 23, 2024.

  1. chillier

    chillier Senior Member (Voting Rights)

    Messages:
    240
    Link:
    https://www.cell.com/cell/fulltext/S0092-8674(24)01268-6

    Highlights

    Construct a comprehensive proteomics atlas for 1,706 human diseases and traits

    Machine-learning-based big data uncover promising diagnostic and predictive biomarkers

    Identify 37 drug repurposing prospects and 26 potential targets with good safety

    Provide an open-access proteome-phenome resource to advance precision medicine

    Summary
    Large-scale proteomics studies can refine our understanding of health and disease and enable precision medicine. Here, we provide a detailed atlas of 2,920 plasma proteins linking to diseases (406 prevalent and 660 incident) and 986 health-related traits in 53,026 individuals (median follow-up: 14.8 years) from the UK Biobank, representing the most comprehensive proteome profiles to date. This atlas revealed 168,100 protein-disease associations and 554,488 protein-trait associations. Over 650 proteins were shared among at least 50 diseases, and over 1,000 showed sex and age heterogeneity. Furthermore, proteins demonstrated promising potential in disease discrimination (area under the curve [AUC] > 0.80 in 183 diseases). Finally, integrating protein quantitative trait locus data determined 474 causal proteins, providing 37 drug-repurposing opportunities and 26 promising targets with favorable safety profiles. These results provide an open-access comprehensive proteome-phenome resource (https://proteome-phenome-atlas.com/) to help elucidate the biological mechanisms of diseases and accelerate the development of disease biomarkers, prediction models, and therapeutic targets.

    Graphical summary:
    upload_2024-11-23_9-1-39.png
     
  2. chillier

    This is the UK Biobank proteomics data, so it's the same dataset that the Beentjes/Ponting et al. biomarker preprint uses. It seems to be the same set of proteins (2,920 in this paper, 2,923 in the Beentjes preprint).

    The authors look for associations between proteins and disease status across many hundreds of diseases, using regression models that account for age, sex, ethnicity, deprivation etc., as well as technical factors.
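
    To give a feel for what a covariate-adjusted regression looks like for a single protein, here is a minimal sketch on simulated data. I'm assuming a logistic regression purely for illustration (the exact models are in the paper's methods), and the cohort, effect sizes and the plain gradient-descent fitting are all invented, not taken from the paper.

```python
import math
import random

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit a logistic regression by batch gradient descent.

    X is a list of covariate rows (without intercept); returns
    [intercept, coef_1, coef_2, ...].
    """
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for row, label in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # predicted risk minus outcome
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Simulated cohort (hypothetical): standardised protein level, standardised age, sex (0/1).
rng = random.Random(0)
X, y = [], []
for _ in range(1000):
    protein, age, sex = rng.gauss(0, 1), rng.gauss(0, 1), rng.randint(0, 1)
    log_odds = -1.0 + 1.0 * protein + 0.5 * age + 0.3 * sex  # made-up true effects
    risk = 1.0 / (1.0 + math.exp(-log_odds))
    X.append([protein, age, sex])
    y.append(1 if rng.random() < risk else 0)

intercept, beta_protein, beta_age, beta_sex = fit_logistic(X, y)
print(f"protein log-odds per SD, adjusted for age and sex: {beta_protein:.2f}")
```

    The point is only that the protein coefficient is estimated with age and sex in the same model, so it reflects the association left over after those are accounted for. Repeating this for every protein and every disease is what produces the long lists of associations.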

    The authors make a distinction between 'prevalent' and 'incident' diseases. That is, at the time of blood sampling, is a patient already diagnosed with a disease (prevalent), or are they healthy at the time of sampling but go on to be diagnosed at some point in the future (incident)? In this way, one might argue that there is a causal relationship for a significant protein in the 'incident' category, as the protein change precedes the probable onset of the disease.
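
    As a rough illustration of that split, here is a small sketch that labels a participant from two dates. The function name, the example dates and the choice to count a diagnosis on the sampling day itself as prevalent are my own assumptions, not something stated in the paper.

```python
from datetime import date

def classify_participant(sample_date, diagnosis_date):
    """Label a participant relative to their blood sampling date.

    diagnosis_date is None if the disease code never appears in their records.
    """
    if diagnosis_date is None:
        return "control"
    if diagnosis_date <= sample_date:
        return "prevalent"  # already diagnosed when blood was taken
    return "incident"       # diagnosed at some point during follow-up

sample = date(2008, 6, 1)  # hypothetical sampling date
print(classify_participant(sample, date(2005, 3, 10)))  # prevalent
print(classify_participant(sample, date(2015, 9, 2)))   # incident
print(classify_participant(sample, None))               # control
```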

    The authors define diseases with ICD-10 codes, which means that they include G93.3 (post viral fatigue syndrome) and M797 (fibromyalgia, i.e. ICD-10 M79.7).

    G93.3 and M797 only appear to have data for the incident category, which might imply there are very few people already diagnosed with these codes at the time of sampling. G93.3 (114 cases, 51710 controls), M797 (452 cases, 45262 controls).

    As far as I can see from the methods, controls are simply defined as people who don't have the disease (though in that case I'm not sure why the total number of participants differs between the two analyses, possibly in part due to differing numbers of prevalent cases).
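
    To put numbers on that, here is a quick bit of arithmetic using only the case/control counts quoted above and the 53,026 total from the abstract. The variable names are mine, and the script says nothing about why participants are missing from each analysis.

```python
STUDY_TOTAL = 53026  # participants in the atlas, from the abstract

# (cases, controls) for the incident analyses, as quoted in this post
counts = {"G93.3": (114, 51710), "M797": (452, 45262)}

summary = {}
for code, (cases, controls) in counts.items():
    analysed = cases + controls
    summary[code] = {
        "analysed": analysed,
        "not_analysed": STUDY_TOTAL - analysed,
        "case_fraction": cases / analysed,
    }
    print(f"{code}: {analysed} analysed, {STUDY_TOTAL - analysed} not in the analysis, "
          f"{100 * cases / analysed:.2f}% cases")
```

    So the G93.3 analysis covers 51,824 people and the M797 analysis 45,714, and cases make up roughly 0.2% and 1% of each respectively.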

    For comparison, Beentjes et al. have 171 cases and 13,883 controls for their proteomics data analysis. These patients actually have the disease at the time of sampling, and it's defined by self-reporting CFS or reporting a diagnosis of ME, not by ICD-10 codes. Controls are defined as not having G93.3, not self-reporting CFS etc., and also reporting 'good' health, which might explain why there are fewer controls in their analysis.

    They provide this link, https://proteome-phenome-atlas.com/, which we can use to compare different proteins and diseases.
     
  3. chillier

    Both fibromyalgia and post viral fatigue syndrome have hundreds of proteins that are significant after multiple-testing correction. I haven't looked very closely at the lists, but the top hits for post viral fatigue don't seem to match the hits from Beentjes et al. all that closely (they are looking at different things though, as I say).
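
    For reference, correcting for multiple tests means tightening the significance threshold because thousands of proteins are tested at once. Below is a minimal sketch using a Bonferroni correction over 2,920 proteins. Bonferroni is my assumption for illustration (check the paper's methods for what they actually used), and the protein names and p-values are invented.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cut-off that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def significant_hits(p_values, alpha=0.05, n_tests=None):
    """Return names whose p-value passes the Bonferroni-corrected threshold."""
    n = n_tests if n_tests is not None else len(p_values)
    cutoff = bonferroni_threshold(alpha, n)
    return sorted(name for name, p in p_values.items() if p < cutoff)

# Invented p-values for three hypothetical proteins, tested among 2,920 in total.
example = {"PROT_A": 3e-9, "PROT_B": 4e-4, "PROT_C": 1e-5}
threshold = bonferroni_threshold(0.05, 2920)
print(f"threshold: {threshold:.2e}")
print(significant_hits(example, n_tests=2920))
```

    With 2,920 tests the cut-off drops from 0.05 to about 1.7e-5, so a p-value of 4e-4 that would look convincing on its own no longer counts as a hit.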

    Here, using incident data they try to predict disease onset from the proteomic data.

    This is for post viral fatigue syndrome, and it does not perform well: the green line is close to the diagonal, whereas for a model that performs well we would want the line to bulge up to the left as much as possible. On the right are the most important proteins for correctly classifying patients and controls in this analysis.
    upload_2024-11-23_9-53-25.png
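
    To make 'close to the diagonal' concrete: the area under the ROC curve (AUC) is 0.5 for a line on the diagonal and 1.0 for a curve that hugs the top-left corner. A minimal sketch of the calculation, using the pairwise definition of AUC and invented risk scores (the function name is mine, and this is not the authors' pipeline):

```python
def roc_auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half. This equals the area under the ROC curve.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

print(roc_auc([0.9, 0.8], [0.3, 0.1]))  # perfect separation -> 1.0
print(roc_auc([0.9, 0.4], [0.5, 0.1]))  # partial overlap -> 0.75
print(roc_auc([0.5, 0.5], [0.5, 0.5]))  # no information -> 0.5
```

    The abstract's 'AUC > 0.80 in 183 diseases' is on this same scale, so a curve sitting near the diagonal means the proteins are barely better than chance at separating future cases from controls.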

    Here is the same thing for fibromyalgia. It appears to perform much better. Leptin tops the list, and if you go through them (google them), lots seem to have something to do with metabolic hormones (LEP, FETUB, CHGA, PPY, INSL3).
    upload_2024-11-23_9-59-44.png

    Here's depression. Leptin also tops the list for this one (performance is not that good):
    upload_2024-11-23_10-4-52.png

    Multiple sclerosis looks quite different (performance is not that good):
    upload_2024-11-23_10-6-2.png
     
  4. Yann04

    Yann04 Senior Member (Voting Rights)

    Messages:
    1,094
    Location:
    Switzerland (Romandie)
    This is really interesting. What could be the meaning of such poor performance in PVFS? That it’s a near useless diagnosis that jumbles up too many different things together?
     
