Targeted genomic sequencing with probe capture for discovery and surveillance of coronaviruses in bats

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This work applies hybrid-capture sequencing for coronavirus (CoV) surveillance in bats. Given that bats are a major reservoir for animal-to-human virus spillover events, which have caused several major epidemics/pandemics, this is a very important field of research. The reported hybrid-capture method shows some clear advantages over amplicon-based viral sequencing, which is the established standard in the field. This new approach has clear merits that are well supported by the data presented and is likely to become an important tool in viral surveillance programs that ultimately aim to predict/prevent/prepare for future pandemics. The work will be of interest to microbiologists, particularly those studying viruses or interested in genomics surveillance.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

Abstract

Public health emergencies like SARS, MERS, and COVID-19 have prioritized surveillance of zoonotic coronaviruses, resulting in extensive genomic characterization of coronavirus diversity in bats. Sequencing viral genomes directly from animal specimens remains a laboratory challenge, however, and most bat coronaviruses have been characterized solely by PCR amplification of small regions from the best-conserved gene. This has resulted in limited phylogenetic resolution and left viral genetic factors relevant to threat assessment undescribed. In this study, we evaluated whether a technique called hybridization probe capture can achieve more extensive genome recovery from surveillance specimens. Using a custom panel of 20,000 probes, we captured and sequenced coronavirus genomic material in 21 swab specimens collected from bats in the Democratic Republic of the Congo. For 15 of these specimens, probe capture recovered more genome sequence than had been previously generated with standard amplicon sequencing protocols, providing a median 6.1-fold improvement (ranging up to 69.1-fold). Probe capture data also identified five novel alpha- and betacoronaviruses in these specimens, and their full genomes were recovered with additional deep sequencing. Based on these experiences, we discuss how probe capture could be effectively operationalized alongside other sequencing technologies for high-throughput, genomics-based discovery and surveillance of bat coronaviruses.
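
The fold-improvement figures above are ratios of the genome sequence recovered by each method. A minimal sketch of that arithmetic follows, assuming per-specimen counts of recovered bases are available. The specimen names and base counts are hypothetical placeholders, not data from the study, and taking the median only over specimens that improved is an assumption made for illustration.

```python
from statistics import median


def fold_improvements(recovered):
    """Return {specimen: capture_bases / amplicon_bases} for specimens where
    probe capture recovered more sequence than amplicon sequencing."""
    folds = {}
    for specimen, (amplicon_bases, capture_bases) in recovered.items():
        # Skip specimens with no amplicon sequence (ratio undefined)
        # and specimens where probe capture did not improve recovery.
        if amplicon_bases > 0 and capture_bases > amplicon_bases:
            folds[specimen] = capture_bases / amplicon_bases
    return folds


# Hypothetical (amplicon_bases, capture_bases) pairs, for illustration only.
recovered = {
    "specimen_A": (400, 2400),
    "specimen_B": (400, 27600),
    "specimen_C": (800, 600),   # no improvement, excluded from the summary
    "specimen_D": (500, 1000),
}

folds = fold_improvements(recovered)
summary = (median(folds.values()), max(folds.values()))
print(f"median fold: {summary[0]:.1f}, maximum fold: {summary[1]:.1f}")
```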

Article activity feed

  1. Evaluation Summary:

    This work applies hybrid-capture sequencing for coronavirus (CoV) surveillance in bats. Given that bats are a major reservoir for animal-to-human virus spillover events, which have caused several major epidemics/pandemics, this is a very important field of research. The reported hybrid-capture method shows some clear advantages over amplicon-based viral sequencing, which is the established standard in the field. This new approach has clear merits that are well supported by the data presented and is likely to become an important tool in viral surveillance programs that ultimately aim to predict/prevent/prepare for future pandemics. The work will be of interest to microbiologists, particularly those studying viruses or interested in genomics surveillance.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

  2. Joint Public Review:

    Here the authors develop and evaluate a new hybrid-capture sequencing approach for coronavirus (CoV) surveillance in bats. The intended goal is to overcome limitations in amplicon sequencing, which is the current standard method for viral surveillance in animal species. Whereas amplicon sequencing is only suitable for targeted analysis of the highly conserved RdRp gene in bat CoVs, the new hybrid-capture approach affords a great breadth of coverage across the full genome in diverse CoV species. This promises to improve the identification and phylogenetic analysis of bat CoVs. The authors conclude by making practical recommendations about how their new method can be applied to usefully complement existing technologies in the field.

    The new method appears to suffer from a lower sensitivity for CoV detection than amplicon sequencing, and also struggles to yield complete sequences across the bat CoV spike protein, which is a highly divergent region. The authors have appropriately acknowledged these weaknesses, and show how other complementary tools can alleviate them - for example by using deep metagenome sequencing to resolve the spike protein in new CoV strains discovered through hybrid capture sequencing.

    This is an excellent paper in my opinion. The study addresses an important problem: improving methodologies for CoV surveillance in bats, a common source of zoonotic viral transmission events. The authors developed a new method that has obvious utility. They have fairly evaluated this method against existing approaches (targeted amplicon sequencing and deep metagenomic sequencing) using appropriate data. In addition to describing a useful new method, the study also produced novel results that are likely valuable: complete (or near-complete) genome sequences for several novel bat coronaviruses. The authors discuss the outcomes in a fair and balanced fashion and make some simple, practical recommendations about how their new tool might best be used. Finally, the article was very well written: clear, concise, and fluent.

  3. SciScore for 10.1101/2022.04.25.489472:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics: not detected.
    Sex as a biological variable: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    Software and Algorithms

    Sentence: RNA concentration and RNA Integrity Number (RIN) for all RNA extracts were measured using the Agilent BioAnalyzer 2100 instrument with the RNA 6000 Nano kit.
    Resource: Agilent BioAnalyzer
    Suggested: None

    Sentence: Probe coverage of reference sequences was also assessed in silico using ProbeTools.
    Resource: ProbeTools
    Suggested: None

    Sentence: De novo assembly of contigs from captured reads: coronaSPAdes (v3.15.0) was used to assemble contigs de novo from probe captured MiSeq data [Meleshko 2021].
    Resource: MiSeq
    Suggested: None

    Sentence: CoV contigs were identified using BLASTn (v2.5.0) against a local database composed of all coronaviridae sequences in GenBank available as of October 11, 2021 [Camacho 2009].
    Resource: BLASTn
    Suggested: (BLASTN, RRID:SCR_001598)

    Sentence: Depth and extent of read coverage were determined with bedtools genomecov (v2.30.0) [Quinlan 2010].
    Resource: bedtools
    Suggested: (BEDTools, RRID:SCR_006646)

    Sentence: HiSeq reads were mapped to draft genomes using bwa mem (v0.7.17-r1188), then alignments were filtered, sorted, and indexed using samtools (v1.11) [Li 2009a, Li 2009b].
    Resource: samtools
    Suggested: None

    Sentence: Phylogenetic analysis of novel spike gene sequences: Novel spike genes were translated from complete genomes then queried against all translated coronaviridae spike sequences in GenBank using BLASTp (v2.5.0) [Camacho 2009].
    Resource: BLASTp
    Suggested: (BLASTP, RRID:SCR_001010)

    Sentence: For each genus, novel spike genes from study specimens were combined with the 25 closest-matching GenBank spike sequences and all spike sequences available in RefSeq.
    Resource: RefSeq
    Suggested: None

    Sentence: Multiple sequence alignments were conducted with clustalw (v2.1), then phylogenetic trees were constructed from aligned sequences using PhyML (v3.3.20190909) [Thompson 1994, Guindon 2005].
    Resource: PhyML
    Suggested: (PhyML, RRID:SCR_014629)
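
The "depth and extent of read coverage" step listed above can be illustrated with a short sketch. It assumes per-position depths in the three-column, tab-separated form (sequence name, 1-based position, depth) that `bedtools genomecov -d` is understood to report. The function name, the `min_depth` threshold, and the example values are hypothetical and are not taken from the study's pipeline.

```python
def coverage_summary(per_base_text, min_depth=1):
    """Summarize lines of the form 'name<TAB>position<TAB>depth'.

    Returns the number of positions, the extent of coverage (fraction of
    positions with depth >= min_depth), and the mean depth.
    """
    depths = []
    for line in per_base_text.strip().splitlines():
        _name, _position, depth = line.split("\t")
        depths.append(int(depth))
    if not depths:
        raise ValueError("no per-position depth records found")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "positions": len(depths),
        "extent": covered / len(depths),
        "mean_depth": sum(depths) / len(depths),
    }


# Hypothetical five-position reference, for illustration only.
example = "\n".join([
    "ref\t1\t0",
    "ref\t2\t4",
    "ref\t3\t6",
    "ref\t4\t0",
    "ref\t5\t10",
])
summary = coverage_summary(example)
print(summary)
```

Raising `min_depth` gives a stricter definition of "covered", which may be preferable when only positions with enough reads to support a consensus call should count.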

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    This study also revealed two important limitations for probe capture in CoV discovery and surveillance applications. The first, which appeared to be the most limiting in this study, is the in vitro sensitivity of this method. Probe capture must be performed on already constructed metagenomic sequencing libraries. The library construction process involves numerous sequential biochemical reactions and bead clean-ups, where inefficiencies result in compounding losses of input material. Combined with the low prevalence of viral genomic material in swab specimens, these losses of input material can lead to the presence of incomplete viral genomes in sequencing libraries and stochastic recovery during probe capture. Amplicon sequencing does not suffer the same attrition because enrichment occurs as the first step of the process, allowing library construction to occur on abundant amplicon input material. Further work optimizing metagenomic library construction protocols could be done to improve sensitivity for probe capture. Also, this study relied on archived material in suboptimal condition, so better results could be expected from fresh surveillance specimens. The second limitation highlighted by this work is the challenge of designing hybridization probes from available reference sequences for poorly characterized taxa. Currently, the extent of human knowledge about bat CoV diversity remains limited, especially across hypervariable genes like spike, and it seems impossible to des...
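
The compounding losses described in the first limitation can be made concrete with a small numerical sketch. It assumes that each library construction step retains a fixed fraction of its input and that each copy of a viral genome fragment survives independently. Both are simplifications, and the step count, efficiencies, and copy numbers are hypothetical rather than measured values from the study.

```python
import math


def surviving_fraction(step_efficiencies):
    """Fraction of input retained after all sequential steps (product of
    the per-step retained fractions)."""
    return math.prod(step_efficiencies)


def prob_no_copies_survive(input_copies, step_efficiencies):
    """Probability that every copy of a fragment is lost, assuming each
    copy survives the whole workflow independently."""
    p_survive = surviving_fraction(step_efficiencies)
    return (1.0 - p_survive) ** input_copies


# Hypothetical workflow: six steps, each retaining 70% of its input.
steps = [0.7] * 6
fraction = surviving_fraction(steps)

# Scarce viral template versus abundant amplicon-like template.
p_lost_scarce = prob_no_copies_survive(10, steps)
p_lost_abundant = prob_no_copies_survive(1000, steps)

print(f"retained fraction: {fraction:.3f}")
print(f"P(fragment lost), 10 copies: {p_lost_scarce:.3f}")
print(f"P(fragment lost), 1000 copies: {p_lost_abundant:.3g}")
```

Under these assumptions, a fragment present in only a few copies has a substantial chance of being lost entirely, while an abundant template is almost never lost. This is consistent with the contrast drawn above between probe capture and amplicon sequencing.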

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.