Optimising predictive models to prioritise viral discovery in zoonotic reservoirs

Version published to 10.1016/s2666-5247(21)00245-7

Aug 1, 2022

Miranda Welsh

Review 2: "Optimizing Predictive Models to Prioritize Viral Discovery in Zoonotic Reservoirs"

Reviewers find this a rigorous and well-supported approach to target future studies of betacoronaviruses in bats, and raise a few questions (about the model and the data used) for further clarification.

SciScore for 10.1101/2020.05.22.111344:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	IRB: The HP3 dataset consists of 2,805 associations between 754 mammal hosts and 586 virus species, compiled from the International Committee on Taxonomy of Viruses (
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Data were sorted using a Python script that saved all available metadata regarding accession number, division, submission date, entry title, organism, genus, genome length, host classification, country, collection date, PubMed ID, journal containing associated publication, publication year, genome completeness, and the gene sequenced.	Python suggested: (IPython, RRID:SCR_001658) PubMed suggested: (PubMed, RRID:SCR_004846)

Results from OddPub: Thank you for sharing your code and data.

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

Several limitations apply to our work, most notably the difficulty of empirically verifying predictions. Although some virological studies have incidentally tested specific hypotheses (e.g., filovirus models and bat surveys [27,56], henipavirus models and experimental infections [23,57]), model-based predictions are nearly never subject to systematic verification or post-hoc efforts to identify and correct spurious results. Greater dialogue between modelers and empiricists is necessary to systematically confront the growing set of predicted host-virus associations with experimental validation or field observation. Already, we have identified roughly a half-dozen new bat reservoirs of betacoronaviruses that our study correctly predicted. Scotophilus heathii, Hipposideros larvatus, and Pteropus lylei, all highly predicted bat species in our out-of-sample rankings, have been reported positive for betacoronaviruses in the literature [43,58]; however, resulting sequences were not annotated to genus level in GenBank. More recently, the release of 630 novel bat coronaviruses sequences from China included four additional bat hosts (Hipposideros pomona, Scotophilus kuhlii, Myotis pequinius, and M. horsfieldii) [59]. All but M. horsfieldii were correctly identified by the ensemble, and three models correctly identified all seven verification hosts (Trait-based 1 and 3 and Network-based 1). These results support the idea that our models identified relevant targets correctly but also highlight an ev...

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.
