Rapid genotyping of viral samples using Illumina short-read sequencing data
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
An accurate genome sequence is arguably the most important piece of information about a microorganism. With current next-generation sequencing methods, sequencing data can be generated at an unprecedented pace. However, we still lack tools for the automated and accurate reference-based genotyping of viral sequencing reads. This paper presents our pipeline, designed to reconstruct the dominant consensus genome of viral samples and to analyze their within-host variability. We benchmarked our approach on numerous datasets and showed that the consensus genome of samples could be obtained reliably without further manual data curation. Our pipeline can be a valuable tool for the fast identification of viral samples. The pipeline is publicly available on the project's GitHub page (https://github.com/laczkol/QVG).
Article activity feed
SciScore for 10.1101/2022.03.21.485184:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Ethics not detected.
- Sex as a biological variable not detected.
- Randomization not detected.
- Blinding not detected.
- Power Analysis not detected.
Table 2: Resources
No key resources detected.
Results from OddPub: Thank you for sharing your code.
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: The low read depth can be a limitation of the approach presented here and any reference-based genotyping method. The obtained consensus genome of the adenovirus sample covered 99.4 % of the reference genome. The SNP density appeared to be higher in the pVI, ORF22, and ORF17–ORF19A genes relative to the rest of the genome. In total, we observed seven sites with an AB > 0. The phylogenetic reconstruction clustered the consensus genome genotyped here and the publicly available genome of the same sample (Figure 6A), agreeing with the clustering based on pairwise genetic distances. We only observed indel mutations between the two mentioned sequences that could be linked to the automatic masking of low-depth genomic regions (read depth < 5). This low divergence of the reference points out the accuracy of the presented pipeline. This work reports a pipeline capable of rapid and automated analysis of viral genomes obtained by NGS. Unlike proprietary software solutions, this pipeline relies on freely available, open-source bioinformatic software. Using parallel execution of tasks, we could obtain consensus genomes of the SARS-CoV2 dataset generated for this study without the need for laborious manual data curation required by Geneious and with similar accuracy. Our pipeline generated good quality consensus genomes using its default settings in most cases, with the FCoV sample as the only exception. Moreover, we could also investigate the intra-host diversity of samples using the allel...
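The masking of low-depth regions and the reported genome coverage mentioned above can be illustrated with a minimal sketch. This is not the code used by the pipeline; the function names `mask_low_depth` and `covered_fraction` are hypothetical, and the only value taken from the text is the depth threshold of 5. It assumes a consensus sequence and a per-position read depth list of equal length are already available.

```python
from typing import Sequence

# Threshold quoted in the text: positions with read depth < 5 are masked.
MIN_DEPTH = 5


def mask_low_depth(consensus: str, depth: Sequence[int],
                   min_depth: int = MIN_DEPTH) -> str:
    """Replace every base whose read depth is below min_depth with 'N'.

    Hypothetical helper for illustration only.
    """
    if len(consensus) != len(depth):
        raise ValueError("consensus and depth must have the same length")
    return "".join(
        base if d >= min_depth else "N"
        for base, d in zip(consensus, depth)
    )


def covered_fraction(masked: str) -> float:
    """Return the fraction of positions that are not masked as 'N'."""
    if not masked:
        return 0.0
    return sum(base != "N" for base in masked) / len(masked)


# Toy data, not taken from the study.
toy_consensus = "ACGTACGT"
toy_depth = [10, 7, 4, 0, 5, 9, 3, 12]

toy_masked = mask_low_depth(toy_consensus, toy_depth)
toy_coverage = covered_fraction(toy_masked)
print(toy_masked, toy_coverage)
```

In this toy case three of eight positions fall below the threshold, so they are replaced by `N` and do not count towards coverage. A figure such as the 99.4 % quoted above would correspond to the same ratio computed over a full-length consensus.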
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.