Predicting mammalian hosts in which novel coronaviruses can be generated

Abstract

Novel pathogenic coronaviruses, including SARS-CoV and SARS-CoV-2, arise by homologous recombination in a host cell [1,2]. This process requires a single host to be infected with more than one type of coronavirus, which recombine to form novel strains of virus with unique combinations of genetic material. Identifying possible sources of novel coronaviruses requires identifying hosts (termed recombination hosts) of more than one coronavirus type, in which recombination might occur. However, the majority of coronavirus-host interactions remain unknown, and therefore the vast majority of recombination hosts for coronaviruses cannot be identified. Here we show that there are 11.5-fold more coronavirus-host associations, and over 30-fold more potential SARS-CoV-2 recombination hosts, than have been observed to date. We also show that there are over 40-fold more host species with four or more different subgenera of coronaviruses than have been observed. These results point to an underestimation of both the number of recombination hosts and the potential for novel coronavirus generation in wild and domesticated animals. Our results list specific high-risk hosts in which our model predicts homologous recombination could occur; the model identifies both wild and domesticated mammals, including known important and understudied species. We recommend these species for coronavirus surveillance, as well as enforced separation in livestock markets and agriculture.
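The definition of a recombination host used in the abstract (a host associated with more than one coronavirus type) can be illustrated with a minimal sketch. This is not the authors' prediction model, which the abstract does not describe in detail; it only shows how hosts could be counted once a set of host-virus associations, observed or predicted, is available. All host, virus, and subgenus names below are hypothetical placeholders, and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical (host, virus, subgenus) associations. Placeholder names only;
# these are NOT data from the study.
ASSOCIATIONS = [
    ("host_A", "virus_1", "subgenus_X"),
    ("host_A", "virus_2", "subgenus_Y"),
    ("host_B", "virus_1", "subgenus_X"),
    ("host_C", "virus_2", "subgenus_Y"),
    ("host_C", "virus_3", "subgenus_Y"),
    ("host_C", "virus_4", "subgenus_Z"),
]


def recombination_hosts(associations, min_viruses=2):
    """Return {host: viruses} for hosts with at least `min_viruses` distinct viruses."""
    viruses_by_host = defaultdict(set)
    for host, virus, _subgenus in associations:
        viruses_by_host[host].add(virus)
    return {h: v for h, v in viruses_by_host.items() if len(v) >= min_viruses}


def hosts_with_subgenera(associations, min_subgenera=4):
    """Return sorted hosts associated with at least `min_subgenera` distinct subgenera."""
    subgenera_by_host = defaultdict(set)
    for host, _virus, subgenus in associations:
        subgenera_by_host[host].add(subgenus)
    return sorted(h for h, s in subgenera_by_host.items() if len(s) >= min_subgenera)


def cohosts_of(associations, target_virus):
    """Return sorted hosts of `target_virus` that also carry at least one other virus."""
    multi = recombination_hosts(associations, min_viruses=2)
    return sorted(h for h, viruses in multi.items() if target_virus in viruses)


if __name__ == "__main__":
    print(sorted(recombination_hosts(ASSOCIATIONS)))
    print(hosts_with_subgenera(ASSOCIATIONS, min_subgenera=2))
    print(cohosts_of(ASSOCIATIONS, "virus_1"))
```

Running the same counts on observed associations and on model-predicted associations, then taking the ratio, is one way fold-differences of the kind reported in the abstract could be computed; the exact procedure used by the authors is not stated here.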

Article activity feed

  1. SciScore for 10.1101/2020.06.15.151845:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    This ‘no-preconceptions’ approach enables us to analyse without being restricted by our current incomplete knowledge of the specific biological and molecular mechanisms which govern host-virus permissibility. Additionally, the incorporation of similarity-based learners in our three-perspective approach enabled us to capture new hosts (i.e. with no known association with coronaviruses), thus avoiding a main limitation of approaches which rely only on networks and their topology. We acknowledge certain limitations in our methodology, primarily pertaining to current incomplete datasets in the rapidly developing but still understudied field. 1) The inclusion only of coronaviruses for which complete genomes could be found limited the number of coronaviruses (species or strain) for which we could compute meaningful similarities, and therefore predict potential hosts. The same applies for our mammalian species: we only included mammalian hosts for which phylogenetic, ecological, and geospatial data were available. As more data on sequenced coronaviruses or mammals become available in future, our model can be re-run to further improve predictions, and to validate predictions from earlier iterations. 2) Research effort, centering mainly on coronaviruses found in humans and their domesticated animals, can lead to overestimation of the potential of coronaviruses to recombine in frequently studied mammals, such as lab rodents which were excluded from the results reported here (similar to...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.