Alcov: Estimating Variant of Concern Abundance from SARS-CoV-2 Wastewater Sequencing Data

Abstract

Detection of SARS-CoV-2 in wastewater is an important strategy for community-level surveillance. Variants of concern (VOCs) can be detected in wastewater samples using next-generation sequencing; however, it can be challenging to determine the relative abundance of different VOCs because the reads cannot be assembled into complete genomes. Here, we present Alcov (abundance learning of SARS-CoV-2 variants), a tool that uses mutation frequencies in SARS-CoV-2 sequencing data to predict the distribution of VOC lineages in the sample. We used Alcov to predict the distributions of lineages in three wastewater samples, and the predictions agreed well with clinical data. By predicting not just which VOCs are present but also their relative abundances in the population, Alcov extracts a more complete snapshot of the variants circulating in a community.

Article activity feed

  1. SciScore for 10.1101/2021.06.03.21258306:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    In all of our experiments we preprocessed the reads using SeqPrep [4] to quality filter and merge paired reads if required.
    SeqPrep
    suggested: (SeqPrep, RRID:SCR_013004)
    Next we aligned the reads to the reference genome using BWA [5] and used samtools [6] to sort the reads and save the file as a BAM.
    BWA
    suggested: (BWA, RRID:SCR_010910)
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    We implemented the model using scikit-learn version 0.24 [10] which allows for the additional constraint that all βj must be positive.
    scikit-learn
    suggested: (scikit-learn, RRID:SCR_002577)
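
The positivity constraint mentioned in the last row can be sketched with scikit-learn's `LinearRegression(positive=True)`, an option available from version 0.24. This is a minimal illustration, not Alcov's actual code: the signature matrix, the lineage names, the observed frequencies, the choice of `fit_intercept=False`, and the final normalisation step are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical signature matrix: rows are mutations, columns are lineages.
# Entry [i, j] is 1 if mutation i is characteristic of lineage j, else 0.
lineages = ["lineage_A", "lineage_B", "lineage_C"]
X = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
], dtype=float)

# Made-up mixture used to simulate the observed mutation frequencies.
true_beta = np.array([0.6, 0.3, 0.1])
y = X @ true_beta

# positive=True constrains every coefficient beta_j to be non-negative.
# fit_intercept=False is an assumption: with no lineage present, the
# expected frequency of every signature mutation is taken to be zero.
model = LinearRegression(positive=True, fit_intercept=False)
model.fit(X, y)
beta = model.coef_

# Assumed post-processing: rescale so the abundances sum to one.
abundance = dict(zip(lineages, beta / beta.sum()))
for name, share in abundance.items():
    print(f"{name}: {share:.2f}")
```

In this noise-free toy case the fitted coefficients match the mixture used to generate `y`. With real sequencing data the observed frequencies are noisy, so the fit would only approximate the underlying abundances.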

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Another limitation is the limited lineage information which is included. Since Alcov only looks for VOCs, it may attribute the presence of mutations from non-VOC lineages to VOCs. This could be addressed by also including the specific lineages which are circulating in the community which is being sampled (as determined by clinical sequencing data). Finally, it is worth noting that the approach taken by Alcov is not specific to SARS-CoV-2. The tool could be adapted to monitor future pandemics, or even plant viruses in hydroponic systems.
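
The misattribution described above, and the suggested remedy of adding locally circulating lineages, can be illustrated with a toy example. Everything here is hypothetical (two VOCs, one non-VOC lineage that shares a single mutation with the first VOC, and invented frequencies), and the regression setup is an assumed stand-in for the model rather than Alcov's implementation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def fit_abundances(signatures, frequencies):
    """Non-negative least squares fit of lineage abundances (a sketch)."""
    model = LinearRegression(positive=True, fit_intercept=False)
    model.fit(signatures, frequencies)
    return model.coef_


# Rows are four hypothetical mutations; columns are VOC_A and VOC_B.
voc_signatures = np.array([
    [1, 0],  # mutation 1: VOC_A only
    [1, 0],  # mutation 2: VOC_A, also carried by the local non-VOC lineage
    [0, 1],  # mutation 3: VOC_B only
    [0, 1],  # mutation 4: VOC_B only
], dtype=float)

# Signature of the hypothetical local non-VOC lineage: mutation 2 only.
local_column = np.array([[0], [1], [0], [0]], dtype=float)
extended_signatures = np.hstack([voc_signatures, local_column])

# Simulated sample: 50% VOC_B and 50% local lineage, with no VOC_A at all.
observed = np.array([0.0, 0.5, 0.5, 0.5])

voc_only = fit_abundances(voc_signatures, observed)
extended = fit_abundances(extended_signatures, observed)

print("VOC-only model, VOC_A estimate:", round(float(voc_only[0]), 3))
print("Extended model, VOC_A estimate:", round(float(extended[0]), 3))
```

With only the VOC columns, the fit assigns VOC_A an abundance of about 0.25 even though it is absent from the simulated sample, because mutation 2 has nowhere else to go. Once the local lineage is included as a column, the VOC_A estimate drops to approximately zero.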

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.