Constructing a multiple-layer interactome for SARS-CoV-2 in the context of lung disease: Linking the virus with human genes and co-infecting microbes

Abstract

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has caused millions of deaths worldwide. Many efforts have focused on unraveling the mechanism of the viral infection to develop effective strategies for treatment and prevention. Previous studies have provided some clarity on the protein-protein interaction linkages occurring during the life cycle of viral infection; however, we lack a complete understanding of the full interactome, comprising human miRNAs, protein-coding genes, and co-infecting microbes. To address this gap, we developed a statistical modeling method based on latent Dirichlet allocation (called MLCrosstalk, for multiple-layer crosstalk) that fuses many types of data to construct the full interactome of SARS-CoV-2. Specifically, MLCrosstalk is able to integrate samples with multiple layers of information (e.g., miRNA and microbes), enforce a consistent topic distribution on all data types, and infer individual-level linkages (i.e., linkages that differ between patients). We also implement a secondary refinement with network propagation to allow our microbe-gene linkages to address larger network structures (e.g., pathways). Using MLCrosstalk, we generated a list of genes and microbes linked to SARS-CoV-2. Interestingly, we found that two of the identified microbes, Rothia mucilaginosa and Prevotella melaninogenica, show distinct patterns representing synergistic and antagonistic relationships with the virus, respectively. We also identified several SARS-CoV-2-associated pathways, including the VEGFA-VEGFR2 and immune response pathways, which may provide potential targets for drug design.
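
The abstract does not say which network propagation scheme is used for the refinement step. As an illustration only, the sketch below assumes a random walk with restart over an undirected gene network, which is one common way to spread a linkage score from a seed gene to its pathway neighbours. The function name `propagate`, the `restart` parameter, and the toy gene names are all hypothetical and do not come from the article.

```python
def propagate(adjacency, seed_scores, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected graph (illustrative assumption,
    not necessarily the propagation scheme used by MLCrosstalk).

    adjacency:   dict mapping node -> set of neighbouring nodes (symmetric,
                 every node assumed to have at least one neighbour).
    seed_scores: dict mapping node -> non-negative initial score, e.g. the
                 strength of a microbe-gene linkage.
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("seed_scores must give positive weight to some node")

    # Normalise the seeds so the scores form a probability distribution.
    p0 = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m spreads its current score evenly over its edges.
            inflow = sum(p[m] / len(adjacency[m]) for m in adjacency[n])
            # Mix the diffused score with a restart back to the seeds.
            nxt[n] = (1.0 - restart) * inflow + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network gene_a - gene_b - gene_c - gene_d (hypothetical names).
toy_network = {
    "gene_a": {"gene_b"},
    "gene_b": {"gene_a", "gene_c"},
    "gene_c": {"gene_b", "gene_d"},
    "gene_d": {"gene_c"},
}

# Seed the walk on gene_a, as if a microbe had been linked to that gene only.
scores = propagate(toy_network, {"gene_a": 1.0})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(f"{gene}: {scores[gene]:.4f}")
```

In this toy case the seed gene keeps the largest score and the score decays with distance along the path, which is the behaviour that lets a single microbe-gene linkage inform neighbouring genes in the same pathway. A larger `restart` value keeps the scores closer to the original seeds.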

Article activity feed

  1. SciScore for 10.1101/2021.12.05.471290:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: The transcriptome data were analyzed using the exceRpt pipeline.
    Resource: exceRpt
    Suggested: None

    Sentence: Briefly, RNA-seq reads were subjected to quality assessment using FastQC software v.
    Resource: FastQC
    Suggested: (FastQC, RRID:SCR_014583)

    Sentence: 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/).
    Resource: http://hannonlab.cshl.edu/fastx_toolkit/
    Suggested: (FASTX-Toolkit, RRID:SCR_005534)

    Sentence: Clipped, collapsed reads were filtered through the Univec database of common laboratory contaminants and a human ribosomal database before mapping to the human reference genome (hg19) and pre-miRNA sequences using STAR [50].
    Resource: STAR
    Suggested: (STAR, RRID:SCR_004463)

    Sentence: Pathway integration and curation: We used the Pathwaycommon v12 all-database version as a base, and then integrated the latest online version of KEGG (July 16, 2021) and Reactome (July 3, 2021) to output all the gene pair lists.
    Resource: KEGG
    Suggested: (KEGG, RRID:SCR_012773)

    Sentence: We also combined the pathway information from WikiPathways (May 10, 2021) and gene symbols from the HUGO Gene Nomenclature Committee with the gene pair list.
    Resource: WikiPathways
    Suggested: (WikiPathways, RRID:SCR_002134)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.