Integrative Bioinformatics Analysis Identifies Peripheral Blood Hub Genes and Establishes a Five-Gene Diagnostic Model for Long COVID
Yang Liu, Yuwei Shen, Peilian Cao, Yuqiang Qian, Yutong Chen, Daxing Jiang, Jianan Li
[Line breaks added]
Background
Long COVID (LC) is a multisystem condition with symptoms persisting ≥3 months after SARS-CoV-2 infection. Objective biomarkers are lacking, and its immune and molecular mechanisms remain unclear.
Methods
We integrated five GEO peripheral blood transcriptome datasets (n = 1,717) and applied ComBat batch correction. Differentially expressed genes (DEGs; |log2FC| > 1, FDR < 0.05) were identified using limma. GO and KEGG enrichment analyses were performed with clusterProfiler. Hub genes were defined via STRING and Cytoscape algorithms. miRNA–mRNA interactions were predicted by miRDB, TargetScan, and miRTarBase.
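To make the DEG cutoffs concrete, here is a minimal sketch of the filtering step only (|log2FC| > 1 and FDR < 0.05). It assumes the FDR is the Benjamini-Hochberg adjustment, which the abstract does not state, and it does not reproduce limma's moderated statistics. The function names and the toy gene labels are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, lfc_cut=1.0, fdr_cut=0.05):
    """Keep genes passing both cutoffs: |log2FC| > lfc_cut and FDR < fdr_cut."""
    fdr = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, lfc, q in zip(genes, log2fc, fdr)
        if abs(lfc) > lfc_cut and q < fdr_cut
    ]


# Toy input (made-up values, not from the study).
toy_genes = ["geneA", "geneB", "geneC", "geneD"]
toy_log2fc = [1.5, -2.0, 0.4, 1.2]
toy_pvalues = [0.001, 0.01, 0.0005, 0.2]
toy_degs = select_degs(toy_genes, toy_log2fc, toy_pvalues)
```

In this toy input, geneC fails the fold-change cutoff despite its small p-value, and geneD fails the FDR cutoff despite its fold change, so only geneA and geneB are kept.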
A logistic regression model based on five core genes (CXCR2, CXCR1, JUN, CXCL8, SELPLG) was evaluated by ROC curves, calibration, and decision analysis. Immune cell infiltration was estimated with CIBERSORT, and Reactome GSEA was conducted using fgsea.
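For readers unfamiliar with how such a model is scored, the sketch below shows a logistic score over the five genes and a rank-based ROC AUC. The coefficients and intercept are hypothetical placeholders, not values from the preprint, and the helper names are illustrative. The AUC is computed as the fraction of case/control pairs in which the case scores higher, with ties counted as half.

```python
import math

# Hypothetical placeholder coefficients; the preprint's fitted values
# are not given in the abstract.
HYPOTHETICAL_COEFS = {
    "CXCR2": 0.5,
    "CXCR1": 0.5,
    "JUN": 0.5,
    "CXCL8": 0.5,
    "SELPLG": 0.5,
}
HYPOTHETICAL_INTERCEPT = 0.0


def logistic_score(expression, coefs, intercept):
    """Predicted probability from a logistic model for one sample."""
    z = intercept + sum(coefs[gene] * expression[gene] for gene in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def roc_auc(labels, scores):
    """AUC as the share of (case, control) pairs ranked correctly."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(cases) * len(controls))


# Toy check with made-up scores: 3 of 4 case/control pairs are ordered correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to chance ranking and 1.0 to perfect separation, which gives a scale for reading the reported AUC > 0.92.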
Results
We identified 289 DEGs enriched in neutrophil degranulation, immune response, and coagulation pathways. Seven hub genes emerged: TLR8, FCGR2A, CXCR2, CXCR1, JUN, CXCL8, and SELPLG. The five-gene model achieved AUC > 0.92 across all cohorts.
LC samples showed increased M1 macrophages, neutrophils, and activated dendritic cells, with decreased Tregs, CD8+ T cells, and M2 macrophages. GSEA confirmed dysregulation of innate immunity and coagulation.
Conclusion
This large-scale integrative analysis reveals immune and coagulation disturbances in LC, identifies key diagnostic genes and miRNA networks, and establishes a robust five-gene model for LC detection.
Link (Preprint: Authorea) [Open Access]