ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies
Abstract
Background
SARS-CoV-2 is the highly transmissible etiologic agent of coronavirus disease 2019 (COVID-19) and has been a global scientific and public health challenge since December 2019. Several new variants of SARS-CoV-2 have emerged globally, raising concern about the prevention and treatment of COVID-19. Early detection and in-depth analysis of emerging variants, which allow pre-emptive alerts and mitigation efforts, are thus of paramount importance.
Results
Here we present ClusTRace, a novel bioinformatic pipeline for fast and scalable analysis of sequence clusters or clades in large viral phylogenies. ClusTRace offers several high-level functionalities, including lineage assignment, outlier filtering, alignment, phylogenetic tree reconstruction, cluster extraction, variant calling, visualization and reporting. ClusTRace was developed as an aid for COVID-19 transmission chain tracing in Finland, with the main emphasis on fast screening of phylogenies for markers of super-spreading events and other features of concern, such as high rates of cluster growth and/or accumulation of novel mutations.
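To make the screening idea concrete, the following is a minimal sketch of flagging clusters by growth rate and by accumulation of novel mutations. It is not ClusTRace's actual implementation: the `ClusterSnapshot` structure, the function names and the threshold values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClusterSnapshot:
    """Hypothetical summary of one cluster at two consecutive analysis runs."""
    name: str
    size_before: int      # number of sequences in the earlier run
    size_after: int       # number of sequences in the later run
    novel_mutations: int  # mutations not seen in the earlier run


def growth_rate(cluster: ClusterSnapshot) -> float:
    """Relative growth between the two runs; new clusters count as infinite growth."""
    if cluster.size_before == 0:
        return float("inf") if cluster.size_after > 0 else 0.0
    return (cluster.size_after - cluster.size_before) / cluster.size_before


def flag_clusters(clusters, min_growth=0.5, min_mutations=3):
    """Return (name, reasons) for clusters exceeding either illustrative threshold."""
    flagged = []
    for cluster in clusters:
        reasons = []
        if growth_rate(cluster) >= min_growth:
            reasons.append("growth")
        if cluster.novel_mutations >= min_mutations:
            reasons.append("mutations")
        if reasons:
            flagged.append((cluster.name, reasons))
    return flagged


# Illustrative input only; these numbers are invented, not taken from the study.
example = [
    ClusterSnapshot("A", size_before=10, size_after=12, novel_mutations=0),
    ClusterSnapshot("B", size_before=10, size_after=20, novel_mutations=1),
    ClusterSnapshot("C", size_before=5, size_after=5, novel_mutations=4),
]
print(flag_clusters(example))
```

In practice the thresholds would presumably be tuned to the sampling density and time interval between sequence batches; the values above are placeholders.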
Conclusions
ClusTRace provides an effective interface that can significantly cut down learning and operating costs related to complex bioinformatic analysis of large viral sequence sets and phylogenies. All code is freely available from https://bitbucket.org/plyusnin/clustrace/
Article activity feed
SciScore for 10.1101/2021.12.09.471941:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
- Sentence: In the next step, filtered sequence sets for each lineage are aligned with MAFFT v7 (Katoh and Standley, 2013).
  Resource: MAFFT; suggested: (MAFFT, RRID:SCR_011811)
- Sentence: Amino acid (aa) variants are called with snpEff (Cingolani et al., 2012).
  Resource: snpEff; suggested: (SnpEff, RRID:SCR_005191)
- Sentence: 2.2 Adding new sequences to your analysis: ClusTRace supports updating your analysis with new sequence batches.
  Resource: ClusTRace; suggested: None
Results from OddPub: Thank you for sharing your code.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.