Long COVID Longitudinal Symptoms Burden Clusters Within A National Community-Based Cohort
Abstract
Background
Long COVID is clinically heterogeneous, with evolving symptom trajectories that complicate classification. Prior clustering studies often rely on electronic health record data, which risk underreporting symptoms. We used longitudinal, community-based data to characterize symptom clusters and risk factors.
Methods
We analyzed CHASING COVID Cohort participants with confirmed SARS-CoV-2 infection between December 2020 and December 2022, ≥12 months of follow-up, and long COVID (≥1 new symptom and concurrent activity limitation 3–12 months post-infection, both absent pre-infection). The infection in this window was the index infection; those with pre-index long COVID were excluded. Symptoms were self-reported pre-infection and at ∼3, 6, 9, and 12 months. Missing data were handled via multiple imputation by chained equations (30 datasets). Longitudinal K-means clustering was performed within each imputed dataset, with hierarchical aggregation to derive final assignments. Multivariable logistic regression (adjusting for age, sex) assessed associations of demographic, clinical, and social factors with cluster membership. Within the highest-burden cluster, hierarchical clustering identified symptom phenotypes.
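Cluster labels from separate imputed datasets are arbitrary, so they cannot simply be averaged. The sketch below illustrates one plausible way to aggregate them. It is an assumption, not the authors' stated procedure: it builds a co-assignment matrix (the share of imputed datasets in which two participants fall in the same cluster) and applies average-linkage hierarchical clustering to it. The function names and the toy labelings are hypothetical, and the per-dataset longitudinal K-means step is taken as already done.

```python
from itertools import combinations


def co_assignment(labelings):
    """Share of imputed datasets in which each pair of participants co-clusters."""
    n = len(labelings[0])
    m = len(labelings)
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_clusters(labelings, k):
    """Average-linkage agglomeration on distance = 1 - co-assignment share."""
    co = co_assignment(labelings)
    clusters = [[i] for i in range(len(co))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Mean pairwise distance between the two candidate clusters.
            total = sum(1.0 - co[i][j] for i in clusters[a] for j in clusters[b])
            dist = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or dist < best[0]:
                best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    final = [0] * len(co)
    for cluster_id, members in enumerate(clusters):
        for i in members:
            final[i] = cluster_id
    return final


# Hypothetical labelings of six participants from three imputed datasets.
# Label values differ between runs; the third run moves participant 2.
labelings = [
    [0, 0, 1, 1, 2, 2],
    [2, 2, 0, 0, 1, 1],
    [1, 1, 1, 0, 2, 2],
]
final = consensus_clusters(labelings, k=3)
print(final)  # [0, 0, 1, 1, 2, 2]
```

In this toy case participant 2 is grouped with participants 0 and 1 in one imputation but with participant 3 in the other two, so the consensus assignment keeps it with participant 3.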
Results
Of 1,787 infected participants, 511 met criteria (22% ≥50 years; 55% female; 60% White non-Hispanic; 54% mental health disorder; 7.2% immunodeficiency; 21% ≥2 comorbidities; 30% prior infection; 24% never vaccinated pre-index). Three clusters emerged: highest, moderate, and lowest burden. The highest-burden cluster (median 6 symptoms at 6 and 9 months) was characterized by fatigue (57%–74%), concentration difficulty, post-exertional malaise, myalgia, sleep disturbance, gastrointestinal symptoms, headache, irritability, and mobility limitations (each ∼46%–53%). The moderate cluster peaked at 3 symptoms (fatigue 42%); the lowest remained ∼1 symptom (fatigue 24% at 12 months). Older age (aOR 2.68, 95% CI 1.59–4.53), female sex (2.36, 1.48–3.76), mental health disorder (2.65, 1.65–4.25), and immunodeficiency (4.23, 1.71–10.51) were associated with highest vs lowest burden. Within the highest-burden cluster, three phenotypes emerged: neurological/multisystemic, psychiatric/neurological, and physical/respiratory.
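As a reading aid for the adjusted odds ratios above, the sketch below back-calculates an approximate standard error on the log-odds scale from each reported interval. It assumes the intervals are Wald-type (symmetric around the log odds ratio with a 1.96 multiplier), which the abstract does not state; pooling across imputed datasets may have used a different reference distribution. Under that assumption the geometric midpoint of each interval should land close to the reported point estimate.

```python
import math

# Reported aOR, lower and upper 95% CI bounds (highest vs lowest burden).
reported = {
    "older age": (2.68, 1.59, 4.53),
    "female sex": (2.36, 1.48, 3.76),
    "mental health disorder": (2.65, 1.65, 4.25),
    "immunodeficiency": (4.23, 1.71, 10.51),
}


def log_scale_summary(lower, upper, z=1.96):
    """Return (approximate SE of log OR, geometric midpoint of the interval).

    Assumes a Wald interval: log(OR) +/- z * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    midpoint = math.sqrt(lower * upper)
    return se, midpoint


for factor, (aor, lower, upper) in reported.items():
    se, midpoint = log_scale_summary(lower, upper)
    print(f"{factor}: aOR {aor}, midpoint {midpoint:.2f}, SE(log OR) {se:.2f}")
```

The wide interval for immunodeficiency corresponds to a larger standard error, in line with the small share of participants (7.2%) reporting that condition.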
Conclusions
Distinct long-COVID clusters and phenotypes underscore heterogeneity and support tailored management and risk stratification.