Investigating the lasting effects of SARS-CoV-2 infection and the lung microbiota: No persistent microbial alterations in recovered COVID-19 patients with persistent radiological or respiratory abnormalities.
Abstract
SARS-CoV-2 caused a global pandemic in which infected individuals experienced mild or severe disease. Unfortunately, some patients who experienced severe disease also had lasting abnormalities. The lung microbiome of thirty-eight adult COVID-19 patients with persistent respiratory symptoms and/or radiological abnormalities was analysed. The aim was to investigate whether the lasting radiological abnormalities reported in this cohort were associated with an altered airway microbiome. Thirty-six bronchoalveolar lavage (BAL) fluid samples from patients underwent 16S rRNA gene amplicon sequencing and were compared to twenty-eight non-fibrotic control samples from a previously published study. COVID-19 patients had a statistically significantly greater number of genera than non-fibrotic controls, but at more uneven abundances, although the difference in evenness was not statistically significant. PERMANOVA suggested that COVID-19 can influence lung microbiome composition after accounting for multivariate dispersion. Further analysis showed differences in the relative abundances of Actinomyces, Neisseria, Haemophilus, Rothia and Gemella. Indicator species analysis showed that a COVID-19 lung microbiome profile could be driven in part by differences in Fusobacterium, Actinomyces, Catonella, Oribacterium and Mycobacterium. Associations with clinical parameters were lacking apart from CT lung opacification, which showed a significant negative association with the number of genera. Differential abundance analysis with MaAsLin2 pointed towards Porphyromonas as a potential explanatory genus, though this was not significant after post-hoc correction. DESeq2 revealed enriched oral taxa in the BAL samples, suggesting potential oral translocation reflective of a disease state. Our findings suggest that individuals with persistent radiological abnormalities following SARS-CoV-2 infection have experienced subtle shifts in their microbiome profile, but these are not strongly associated with clinical phenotypes and are therefore unlikely to be of significance.
Article activity feed
-
-
This is a study that would be of interest to the field and community. The reviewers have highlighted major concerns with the work presented. Please ensure that you address their comments.
-
Comments to Author
This was an enjoyable read; the basic science behind these issues is fundamental to our wider understanding of lung disease. It was interesting to read that the microbiome essentially recovered but with a slight compositional change. I would like to see a bit more detail on the particular taxa responsible for the compositional differences. Lines 77-80: The minor but clear differences in composition raise the question of whether any particular species stood out in these groups. I understand this is a small study, but the sequencing should be sufficient to pick out some species. Perhaps add a note in the conclusions to say that more in-depth sequencing may reveal more detail on the community composition and the key players driving these differences. I would like a short description of opacified lung, though that may be because it is not something I have come across. A larger cohort is indeed needed, but that is the way of things! We can only wish. Otherwise I have no other queries.
Please rate the manuscript for methodological rigour
Very good
Please rate the quality of the presentation and structure of the manuscript
Very good
To what extent are the conclusions supported by the data?
Strongly support
Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?
No
Is there a potential financial or other conflict of interest between yourself and the author(s)?
No
If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?
Yes
-
Comments to Author
Reviewer Report - Access Microbiology

Overall assessment - to the Author

This manuscript investigates whether persistent respiratory symptoms or radiological abnormalities following SARS-CoV-2 infection are associated with long-term alterations in the lung microbiota. The study uses 16S rRNA gene amplicon sequencing of bronchoalveolar lavage samples from a well-characterised post-COVID-19 cohort and compares these with previously published non-fibrotic control samples. The work addresses a significant clinical and microbiological question and aligns well with Access Microbiology's sound science remit. The experimental approach is appropriate, ethical approval is clearly stated, and the underlying sequencing data and code are publicly available. The data generally support the conclusions. However, several aspects of methodological clarity, statistical interpretation, and presentation of results require further attention.

Major Comments

1. Methodological rigour and reproducibility
* The methods are largely appropriate and follow established pipelines (QIIME2, DADA2, phyloseq, 'decontam'). However, several key details require clarification to improve reproducibility:
  - Please explicitly state the negative control strategy used (number of reagent controls included, how they were processed alongside samples, and how 'decontam' was parameterised).
  - The decision to remove taxa below 0.005% relative abundance should be justified, particularly given the low-biomass nature of BAL samples.
  - Six samples were removed due to suspected contamination based on UniFrac clustering and Sphingomonas abundance (line 61-64). Please clarify whether this exclusion was predefined or data-driven, and whether sensitivity analyses were performed with and without these samples.
* The use of external control datasets (Invernizzi et al.) introduces potential batch effects (DNA extraction, sequencing platform, primer sets, run effects). While this is acknowledged implicitly, it should be discussed explicitly and, if possible, addressed analytically (e.g. batch variable inclusion or justification for comparability).

2. Statistical analysis and interpretation
* PERMANOVA results should report:
  - The R² value alongside the p-value
* The generalised linear model linking species number to CT opacification is a strength of the manuscript. However, the model specification requires clarification. Please explicitly state the outcome variable, model family, and link function (e.g., binomial/logit), whether interaction terms were tested, and how missing data were handled to ensure reproducibility. For example, 'A generalised linear model with a binomial family and logit link was used to assess the association between species richness and CT opacification (>25%), adjusting for age, sex, and smoking status.'

3. Results organisation and presentation
* A clearer organisation of the Results section might be beneficial, such as by dividing:
  1. Diversity of alpha and beta
  2. Composition of taxa
  3. Burden of bacteria
  4. Associations with clinical parameters
* Figure 1 is information-dense. Consider:
  - Ensuring that each panel is clearly referenced and described in the text in order.
  - Improving figure captions to state the main conclusion of each panel explicitly.

4. Literature context and discussion
* The Discussion is generally balanced and appropriately cautious, particularly in acknowledging limitations in sample size.
* However, further engagement with recent post-COVID respiratory microbiome studies (including differences in sampling timepoints and disease severity) would strengthen the contextualisation of the findings.
* The discussion of Porphyromonas is interesting but speculative; this should be clearly framed as hypothesis-generating. For example, "Although the association between Porphyromonas and lower CT opacification did not reach statistical significance after multiple testing correction, this observation is hypothesis-generating. Porphyromonas is a recognised member of the respiratory microbiota, but its role in post-COVID lung health remains poorly understood and warrants further investigation in larger, longitudinal studies."

5. Data availability and transparency
* The authors have deposited raw sequencing data in NCBI and provided a GitHub repository for analysis code, both of which strongly support reproducibility and meet Access Microbiology requirements.

Minor Comments
* Please ensure consistent terminology (e.g. "non-fibrotic controls" vs "healthy controls").
* It is best to write the full meaning of the pulmonary functional outputs, e.g., FVC (Forced Vital Capacity), at least for the first use.
* Clarify whether "species number" refers to observed ASVs aggregated at the genus level.
* The Abstract could be tightened to reflect the nuanced nature of the findings better and avoid statements that may appear contradictory to the reported p-values.
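As a rough illustration of the PERMANOVA reporting requested under "Statistical analysis and interpretation", the sketch below computes the pseudo-F statistic, the R² (between-group share of the total sum of squares), and a permutation p-value from a Bray-Curtis distance matrix. It is a standard-library Python sketch, not the authors' pipeline; the function names (`bray_curtis`, `distance_matrix`, `permanova`) are hypothetical, and the toy counts are invented for illustration and are not study data.

```python
import random
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(x) + sum(y)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(x, y)) / total


def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = bray_curtis(samples[i], samples[j])
    return dist


def _sums_of_squares(dist, labels):
    """Return (SS_total, SS_within) computed from squared distances."""
    n = len(labels)
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for group in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == group]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    return ss_total, ss_within


def permanova(dist, labels, n_perm=999, seed=0):
    """One-way PERMANOVA: pseudo-F, R2, and permutation p-value."""
    n, n_groups = len(labels), len(set(labels))

    def pseudo_f(current_labels):
        ss_total, ss_within = _sums_of_squares(dist, current_labels)
        ss_between = ss_total - ss_within
        f_stat = (ss_between / (n_groups - 1)) / (ss_within / (n - n_groups))
        return f_stat, ss_between / ss_total

    f_obs, r2 = pseudo_f(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels, keep distances fixed
        if pseudo_f(shuffled)[0] >= f_obs:
            hits += 1
    return {"F": f_obs, "R2": r2, "p": (hits + 1) / (n_perm + 1)}


# Invented toy genus counts (rows = samples), NOT data from the study.
toy_counts = [
    [30, 10, 5, 0], [28, 12, 4, 1], [25, 15, 6, 0],  # group "A"
    [5, 10, 30, 8], [6, 8, 28, 10], [4, 12, 25, 9],  # group "B"
]
toy_labels = ["A"] * 3 + ["B"] * 3
result = permanova(distance_matrix(toy_counts), toy_labels, n_perm=199, seed=1)
print(result)
```

Reporting R² next to the p-value shows how much of the total variation the grouping explains, which a p-value alone does not convey. The sketch assumes every group has non-zero within-group variation; an established implementation would normally be preferred for a real analysis.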
Please rate the manuscript for methodological rigour
Very good
Please rate the quality of the presentation and structure of the manuscript
Good
To what extent are the conclusions supported by the data?
Partially support
Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?
No
Is there a potential financial or other conflict of interest between yourself and the author(s)?
No
If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?
Yes
-
Comments to Author
General Assessment

The study addresses an important and timely question: whether persistent post-COVID respiratory abnormalities are associated with alterations in the lung microbiota. The manuscript fills a relevant gap in knowledge and provides valuable data from a well-defined patient cohort. While the writing is generally understandable, it could benefit from improvements in clarity and narrative flow. Moreover, the manuscript's structure does not follow the conventional format of a full scientific article, lacking clearly delineated sections such as Introduction, Methods, Results, Discussion, and Conclusions. In addition, it does not include the level of methodological detail or depth expected for a full research article, and its overall length and content are more consistent with a Short Communication.

Major Comments

1. Study design and cohort description require clarification:

1.1 Lack of clarity on patient selection criteria.
Even though the authors cite previous papers (Smith et al., 2024; Invernizzi et al., 2021; Vijayakumar et al., 2022a; 2022b), it is essential to summarize the relevant methodological details and cohort definitions within the present manuscript. Relying on the reader to consult multiple external studies to understand the inclusion criteria limits clarity and undermines reproducibility. These clinical and radiological criteria should be explicitly described in the Methods section so that readers can fully understand the cohort characteristics and replicate the study design.

1.2 Use of external control samples (Invernizzi et al.) and associated limitations.
I acknowledge the inherent challenges of conducting microbiome studies in human cohorts, particularly when invasive sampling such as bronchoalveolar lavage is required. These constraints often necessitate the use of publicly available or previously generated datasets. However, relying on control samples from a different study (Invernizzi et al.) introduces several methodological concerns that must be explicitly addressed. Cross-study comparisons can introduce several methodological issues that may affect the validity of the results. Using externally generated control samples increases the likelihood of batch effects arising from differences in sequencing runs, reagent lots, and laboratory workflows. It also introduces discrepancies in processing pipelines, including DNA extraction methods, primer sets, quality filtering steps, and taxonomic assignment through bioinformatic workflows. Furthermore, temporal, geographical, and demographic mismatches between cohorts may influence lung microbiota composition independently of disease status. Together, these factors can artificially inflate false positives or obscure true biological differences. It is therefore recommended that the authors provide a clear justification for combining externally generated control data and explicitly discuss the limitations and potential biases inherent to cross-study comparisons. Including a dedicated paragraph explaining how batch effects were mitigated, or acknowledging that they could not be fully controlled, would greatly enhance methodological transparency and strengthen the credibility of the study's conclusions.

2. Low-biomass microbiome and contamination control insufficiently addressed:
The lung is a known low-biomass environment, which demands stringent contamination handling. Only taxa below 0.005% abundance were removed. This threshold is arbitrary and does not follow recognized low-biomass contamination frameworks (e.g., decontam, frequency-based models, or comparison to negative controls). The mention of Sphingomonas contamination indicates typical reagent contaminants, and the authors report having applied a decontam analysis. However, some taxa in the study are well-documented as common oral taxa that frequently appear in BAL and lung tissue samples due to upper-airway carryover rather than representing true lung microbiota. Inclusion of findings from multiple studies investigating the lung microbiome could provide additional context and strengthen the discussion (Charlson et al., 2011; Pragman et al., 2012; Rylance et al., 2016; Dickson et al., 2017; Bassis et al., 2015). Their detection requires careful interpretation within the context of low-biomass environments, where contamination risk is high (Salter et al., 2014; Eisenhofer et al., 2019). The manuscript would benefit from providing more details on the decontam analysis, including the parameters used, thresholds, and how potential oral carryover was distinguished from reagent or procedural contaminants. A clear description of these steps would increase methodological transparency and strengthen confidence in the interpretation of lung microbiota composition.

3. Statistical analyses need clarification and potentially correction:

3.1 The PERMANOVA results indicate a statistically significant difference in community structure (p = 0.025). However, the manuscript does not report any assessment of group dispersion (PERMDISP). Because PERMANOVA is highly sensitive to differences in within-group variability, the absence of a dispersion test substantially limits the interpretation of these findings. The observed significance may reflect differences in dispersion rather than true compositional shifts. I strongly recommend including a PERMDISP analysis and revising the interpretation of the PERMANOVA results accordingly.

3.2 Indicator species analysis may not be appropriate
Indicator species analysis may not be appropriate in the present context. This method relies on sufficient sample size and clear group separation to reliably identify taxa that are specifically associated with one condition or group. In a dataset characterized by low sample size and potential contamination, particularly in low-biomass environments such as the lung, the probability of detecting spurious 'indicator' taxa increases substantially. Contaminant genera or taxa driven by stochastic noise may appear as statistically significant indicators simply due to small group sizes or uneven dispersion. Therefore, without rigorous contaminant filtering and adequate statistical power, indicator species analysis is likely to yield false positives and should be interpreted with caution or omitted altogether.

4. Interpretation overstates the evidence.
Several conclusions exceed the strength of the data:

4.1 Association vs. causation unclear
Statements imply mechanistic insight ("microbiota-independent mechanisms"), but the study is observational and not powered to infer causality.

4.2 Discussion lacks depth.
Important questions not addressed:
* Could observed taxa reflect upper-airway contamination rather than true lung microbiota shifts?
* How do findings compare to other post-COVID studies?
* What are the biological implications of genera like Actinomyces or Fusobacterium being enriched?

5. Minor Comments:

5.1 Writing and structure
The introduction could be substantially improved by providing more scientific context and clarifying the study's rationale. Specifically, the authors should include background on the lung microbiota as a low-biomass environment, addressing contamination risks and previous findings in health and disease. They should also contextualize what is known about microbiota alterations during and after COVID-19, highlight gaps in current knowledge regarding persistent pulmonary abnormalities, and clearly state the study's objective and hypothesis.
L66-L74: The flow of the Results section jumps between alpha diversity, beta diversity, ddPCR, and clinical correlations without a clear structure.

5.2 Methodological clarity
The exact sequencing depth cutoff is not specified. The filtering criteria for genus aggregation are unclear. The rationale for excluding only six samples appears weak.

5.3 Terminology issues
"Number of species" likely refers to richness of ASVs, not species. Must be corrected.

5.4 Considerations
Given that BAL samples represent a low-biomass environment, additional analyses could further strengthen the study while carefully accounting for contamination risk. Complementary alpha-diversity metrics such as Chao1, Simpson, or Faith's PD could provide a more complete view of richness and phylogenetic diversity, while beta-diversity analyses using Weighted and Unweighted UniFrac may reveal subtle community differences not captured by Bray-Curtis. Multivariate models adjusting for age, sex, smoking, and comorbidities would help isolate associations with COVID-19 while controlling for potential confounders. Furthermore, given the available ASV count tables from DADA2 and the aggregated genus-level abundances, the authors could consider performing complementary analyses using LEfSe or DESeq2. These approaches could help identify taxa consistently associated with COVID-19 or control samples, with DESeq2 additionally allowing adjustment for covariates. Care should be taken in this low-biomass BAL context to filter rare or potentially contaminant taxa, ensuring that observed differences reflect true biological variation rather than technical artifacts or contamination.
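To make the complementary alpha-diversity metrics mentioned under "Considerations" concrete, the following minimal Python sketch computes observed richness, Shannon diversity, the Gini-Simpson index, and bias-corrected Chao1 from a single vector of counts. It is illustrative only and is not the authors' analysis: the function names are hypothetical, the toy counts are invented, and Faith's PD is omitted because it requires a phylogenetic tree.

```python
import math


def observed_richness(counts):
    """Number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i); assumes a non-zero total."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Gini-Simpson index 1 - sum(p_i^2); assumes a non-zero total."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singletons and doubletons."""
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed_richness(counts) + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Invented toy counts for one sample, NOT data from the study.
toy = [10, 5, 2, 1, 1, 0]
summary = {
    "richness": observed_richness(toy),
    "shannon": shannon(toy),
    "simpson": simpson(toy),
    "chao1": chao1(toy),
}
print(summary)
```

Richness and Chao1 respond to rare taxa, whereas Shannon and Simpson weight evenness more heavily, so reporting several metrics side by side helps separate "more genera" from "more uneven abundances". In a low-biomass setting the rare-taxon terms are also the ones most exposed to contamination, so any such metric would need to be computed after contaminant filtering.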
Please rate the manuscript for methodological rigour
Satisfactory
Please rate the quality of the presentation and structure of the manuscript
Poor
To what extent are the conclusions supported by the data?
Partially support
Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?
No
Is there a potential financial or other conflict of interest between yourself and the author(s)?
No
If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?
Yes
-
