Benchmarking large language models for cell-free RNA diagnostic biomarker discovery, 2026, Gaudio et al

forestglip · Friday at 4:55 PM

Benchmarking large language models for cell-free RNA diagnostic biomarker discovery

Gaudio, Hunter A.; Bliss, Andrew; Loy, Conor J.; Eweis-LaBolle, Daniel; Gardella, Anne E.; De Vlaminck, Iwijn

Abstract
Large language models can synthesize biomedical knowledge, parse vast amounts of data, and generate code, positioning them as promising tools for biomarker discovery from high-throughput omics data.

Here, we benchmark six models from OpenAI, Anthropic, and Google on plasma cell-free RNA datasets spanning three clinical cohorts: Kawasaki disease versus multisystem inflammatory syndrome in children, active tuberculosis versus symptomatic respiratory controls, and myalgic encephalomyelitis/chronic fatigue syndrome versus sedentary controls. We evaluate literature-guided nomination of diagnostic gene panels for downstream machine learning and autonomous construction of end-to-end classifiers from raw count matrices to held-out test predictions.

Despite prompt adherence issues, model-nominated panels recapitulate canonical immune pathways and outperform random panels across cohorts, even matching differential gene expression baselines in the tuberculosis cohort. End-to-end automation proves feasible but is model- and task-dependent.

One model approaches conventional performance for Kawasaki disease versus multisystem inflammatory syndrome in children, whereas performance decreases for tuberculosis and myalgic encephalomyelitis/chronic fatigue syndrome cohorts.

These findings delineate current capabilities and limitations of large language models in diagnostics and open a path for their future use in biomarker discovery.
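For readers wondering what "outperform random panels" could mean in practice, here is a minimal sketch of a random-panel baseline. It is not the authors' pipeline. The data are synthetic, the nearest-centroid classifier is a stand-in for whatever downstream machine learning the paper uses, and every name (`simulate`, `panel_accuracy`, `nominated_panel`) and parameter is hypothetical. The idea is to score one candidate panel on held-out samples, score many same-sized random panels the same way, and report where the candidate falls in that distribution.

```python
import random
import statistics


def simulate(n_per_class, n_genes, informative, shift, rng):
    """Synthetic expression matrix (list of rows) and 0/1 labels.

    Assumption: unit-variance Gaussian noise, with the informative genes
    shifted upward in class 1. Real cfRNA counts do not look like this.
    """
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in informative:
                    row[g] += shift
            X.append(row)
            y.append(label)
    return X, y


def centroid(X, y, label, panel):
    """Mean expression of each panel gene over the samples of one class."""
    rows = [x for x, lab in zip(X, y) if lab == label]
    return [statistics.fmean([r[g] for r in rows]) for g in panel]


def panel_accuracy(panel, X_train, y_train, X_test, y_test):
    """Held-out accuracy of a nearest-centroid classifier restricted to panel."""
    c0 = centroid(X_train, y_train, 0, panel)
    c1 = centroid(X_train, y_train, 1, panel)
    correct = 0
    for x, lab in zip(X_test, y_test):
        d0 = sum((x[g] - m) ** 2 for g, m in zip(panel, c0))
        d1 = sum((x[g] - m) ** 2 for g, m in zip(panel, c1))
        predicted = 1 if d1 < d0 else 0
        correct += int(predicted == lab)
    return correct / len(y_test)


rng = random.Random(0)            # fixed seed so the sketch is repeatable
n_genes = 200
informative = list(range(10))     # hypothetical "true" signal genes

X_train, y_train = simulate(40, n_genes, informative, 2.0, rng)
X_test, y_test = simulate(40, n_genes, informative, 2.0, rng)

# Stand-in for a model-nominated panel; here it is simply the signal genes.
nominated_panel = informative
nominated_acc = panel_accuracy(nominated_panel, X_train, y_train, X_test, y_test)

# Null distribution: random panels of the same size, scored identically.
random_accs = [
    panel_accuracy(rng.sample(range(n_genes), len(nominated_panel)),
                   X_train, y_train, X_test, y_test)
    for _ in range(200)
]

# Empirical p-value with the usual +1 correction.
p_value = (1 + sum(a >= nominated_acc for a in random_accs)) / (1 + len(random_accs))

print(f"nominated panel accuracy: {nominated_acc:.3f}")
print(f"median random panel accuracy: {statistics.median(random_accs):.3f}")
print(f"empirical p-value: {p_value:.4f}")
```

Matching the panel size in the random draws matters, since larger panels tend to score higher by chance alone. A real comparison would also need cross-validation and a metric suited to the class balance of each cohort.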

Web | DOI | PDF | Nature Communications | Open Access

