Finding Long-COVID: temporal topic modeling of electronic health records from the N3C and RECOVER programs, 2024, O'Neil et al

forestglip

Finding Long-COVID: temporal topic modeling of electronic health records from the N3C and RECOVER programs

Shawn T. O’Neil, Charisse Madlock-Brown, Kenneth J. Wilkins, Brenda M. McGrath, Hannah E. Davis, Gina S. Assaf, Hannah Wei, Parya Zareie, Evan T. French, Johanna Loomba, Julie A. McMurry, Andrea Zhou, Christopher G. Chute, Richard A. Moffitt, Emily R. Pfaff, Yun Jae Yoo, Peter Leese, Robert F. Chew, Michael Lieberman, Melissa A. Haendel & the N3C and RECOVER Consortia

Abstract
Post-Acute Sequelae of SARS-CoV-2 infection (PASC), also known as Long-COVID, encompasses a variety of complex and varied outcomes following COVID-19 infection that are still poorly understood.

We clustered over 600 million condition diagnoses from 14 million patients available through the National COVID Cohort Collaborative (N3C), generating hundreds of highly detailed clinical phenotypes. Assessing patient clinical trajectories using these clusters allowed us to identify individual conditions and phenotypes strongly increased after acute infection.

We found many conditions increased in COVID-19 patients compared to controls, and using a novel method to associate patients with clusters over time, we additionally found phenotypes specific to patient sex, age, wave of infection, and PASC diagnosis status.

While many of these results reflect known PASC symptoms, the resolution provided by this unprecedented data scale suggests avenues for improved diagnostics and mechanistic understanding of this multifaceted disease.

Link | PDF (npj Digital Medicine) [Open Access]
 
Several conditions are strongly increased in the PASC cohort, including Chronic fatigue syndrome, Malaise, Finding related to attentiveness, Headache, Migraine (with and without aura), and Anxiety disorder.
A high incidence of postural orthostatic tachycardia syndrome (POTS) has been identified in PASC clinical research [39], but a POTS-specific ICD-10 code did not exist prior to October 1, 2022, and therefore POTS is not present in our dataset. The closest available term in the SNOMED hierarchy, Orthostatic hypotension, was found to be significantly elevated in PASC, as were Disorder of the autonomic nervous system and Familial dysautonomia. Many symptoms significant for the PASC cohort, such as Tachycardia, Palpitations, Dizziness and giddiness, Fatigue, and Finding related to attentiveness are suggestive of POTS or similar forms of dysautonomia. The presence of Familial dysautonomia (ICD-10-CM G90.1), a rare genetic disorder, is unlikely to be due to increased screening given that we saw no corresponding uptake in genetic testing. Rather, we suspect that frequent mis-coding may occur because the ICD-10-CM catalog has only one match for the term “dysautonomia” (G90.1 Familial dysautonomia), which when used alone encompasses multiple PASC-related conditions [40]. Such errors are not uncommon when using medical record software [41].
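
To make the mis-coding explanation in the quote concrete, here is a minimal sketch of a free-text code lookup. It is only an illustration under stated assumptions: `CODE_EXCERPT` is a small hand-picked excerpt with abbreviated descriptions (not the real ICD-10-CM catalog), and `search_codes` is a hypothetical helper, not how any particular medical record software works. The point is just that a substring search for "dysautonomia" over such a list has exactly one place to land.

```python
# Hypothetical, hand-picked excerpt of diagnosis codes with abbreviated
# descriptions. This is an assumption for illustration, NOT the full
# ICD-10-CM catalog.
CODE_EXCERPT = {
    "G90.1": "Familial dysautonomia",
    "G90.9": "Disorder of the autonomic nervous system, unspecified",
    "I95.1": "Orthostatic hypotension",
    "R00.0": "Tachycardia, unspecified",
    "R42": "Dizziness and giddiness",
}


def search_codes(term, catalog=CODE_EXCERPT):
    """Return (code, description) pairs whose description contains `term`.

    The match is a case-insensitive substring search, a stand-in for the
    kind of keyword lookup a clinician might use when picking a code.
    """
    needle = term.lower()
    return [
        (code, description)
        for code, description in catalog.items()
        if needle in description.lower()
    ]


# A clinician typing "dysautonomia" only gets the rare genetic disorder.
for code, description in search_codes("dysautonomia"):
    print(code, description)
```

In this toy excerpt the more general entry ("Disorder of the autonomic nervous system") never surfaces for the search term "dysautonomia", which is the kind of situation the authors suspect leads to G90.1 being recorded instead.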
 