Shotgun transcriptome, spatial omics, and isothermal profiling of SARS-CoV-2 infection reveals unique host responses, viral diversification, and drug interactions

Abstract

In less than nine months, the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) killed over a million people, including >25,000 in New York City (NYC) alone. The COVID-19 pandemic caused by SARS-CoV-2 highlights clinical needs to detect infection, track strain evolution, and identify biomarkers of disease course. To address these challenges, we designed a fast (30-minute) colorimetric test (LAMP) for SARS-CoV-2 infection from naso/oropharyngeal swabs and a large-scale shotgun metatranscriptomics platform (total-RNA-seq) for host, viral, and microbial profiling. We applied these methods to clinical specimens gathered from 669 patients in New York City during the first two months of the outbreak, yielding a broad molecular portrait of the emerging COVID-19 disease. We find significant enrichment of a NYC-distinctive clade of the virus (20C), as well as host responses in interferon, ACE, hematological, and olfaction pathways. In addition, we use 50,821 patient records to find that renin–angiotensin–aldosterone system inhibitors have a protective effect for severe COVID-19 outcomes, unlike similar drugs. Finally, spatial transcriptomic data from COVID-19 patient autopsy tissues reveal distinct ACE2 expression loci, with macrophage and neutrophil infiltration in the lungs. These findings can inform public health and may help develop and drive SARS-CoV-2 diagnostic, prevention, and treatment strategies.

Article activity feed

  1. SciScore for 10.1101/2020.04.20.048066:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: Consent: Sample Collection and Processing: Patient specimens were collected with patients’ consent at New York Presbyterian Hospital-Weill Cornell Medical Center (NYPH-WCMC) and then processed for qRT-PCR.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    We also predicted a combined viral load score using Ct, GloMax QuantiFluor readout from LAMP experiments and fraction of SARS-CoV-2 matching NGS reads in a sample.
    LAMP
    suggested: (LAMP, RRID:SCR_001740)
    LAMP Primer Sequences: Primers were designed using PrimerExplorer (v4.0), as per guidelines in Zhang et al., 2020.
    PrimerExplorer
    suggested: None
    Statistical and graphical analysis were performed with GraphPad Prism 8.0.4.
    GraphPad Prism
    suggested: (GraphPad Prism, RRID:SCR_002798)
    Libraries were pooled and sent to the WCM Genomics Core or HudsonAlpha for final quantification by Qubit fluorometer (ThermoFisher Scientific), TapeStation 2200 (Agilent), and QRT-PCR using the Kapa Biosystems Illumina library quantification kit.
    Agilent
    suggested: (Agilent Bravo NGS, RRID:SCR_019473)
    Taxonomic Classification of Sequence Data: All complete genome or chromosome level assemblies from RefSeq database for archaea, bacteria, protozoa, fungi, human and viruses including SARS-CoV and SARS-CoV-2 genomes were downloaded and used for building a classification database for Kraken2 (k=35, ℓ=31) (O’Leary et al., 2016; Wood et al., 2019).
    RefSeq
    suggested: (RefSeq, RRID:SCR_003496)
    To get an approximation for the positive and negative classification rate, the BBMap random-reads script was used to simulate 10 million 150bp paired-end Illumina reads from the database sequences (Segata et al., 2016).
    BBMap
    suggested: (BBmap, RRID:SCR_016965)
    All sequences were classified using the Kraken2 database.
    Kraken2
    suggested: None
    Variants were called using iVar, and pileups and consensus sequences were generated using samtools (Li et al., 2009; Grubaugh et al., 2019; Greenfield et al., 2020).
    samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Variants were identified by enumerating the coordinates and query / reference subsequences associated with mismatches (SNV) and gaps in the query (deletion) and reference (insertions) using R/Bioconductor (GenomicRanges, Rsamtools, Biostrings packages) and Imielinski lab gChain packages (https://github.com/mskilab/gChain).
    GenomicRanges
    suggested: (GenomicRanges, RRID:SCR_000025)
    Biostrings
    suggested: (Biostrings, RRID:SCR_016949)
    Exhaustive variant calling on read alignments was additionally performed using bcftools mpileup and call, with variant read support (VAF, alternate allele count) enumerated with the R/Bioconductor Rsamtools package.
    R/Bioconductor
    suggested: None
    Rsamtools
    suggested: None
    Reads matching Homo sapiens were trimmed with TrimGalore and aligned with STAR (v2.6.1d) to the human reference build GRCh38 and the GENCODE v33 transcriptome reference; gene expression was quantified using featureCounts, stringTie and salmon using the nf-core RNAseq pipeline (Pertea et al., 2015; Malinen et al., 2005; Johnson et al., 2007; Robinson et al., 2010; Naccache et al., 2014; Zamani et al., 2017; Ewels et al., 2019).
    TrimGalore
    suggested: None
    STAR
    suggested: (STAR, RRID:SCR_015899)
    GENCODE
    suggested: (GENCODE, RRID:SCR_014966)
    featureCounts
    suggested: (featureCounts, RRID:SCR_012919)
    stringTie
    suggested: (StringTie , RRID:SCR_016323)
    Sample QC was reported using fastqc, RSeQC, qualimap, dupradar, Preseq and MultiQC (Okonechnikov et al., 2016; Andrews, 2015; Ewels et al., 2016; Sayols et al., 2016; Wang et al., 2012).
    RSeQC
    suggested: (RSeQC, RRID:SCR_005275)
    qualimap
    suggested: (QualiMap, RRID:SCR_001209)
    MultiQC
    suggested: (MultiQC, RRID:SCR_014982)
    Reads, as reported by featureCounts, were normalized using variance-stabilizing transform (vst) in DESeq2 package in R for visualization purposes in log-scale (Love et al., 2014).
    DESeq2
    suggested: (DESeq, RRID:SCR_000154)
    In the first correction ciliated cell fraction (as predicted by MUSIC) was added as another covariate to our model.
    MUSIC
    suggested: (MuSiC, RRID:SCR_008792)
    Using the sequence names in the EUA template, the NCBI taxonomy database was queried to find the highest quality representative sequences for more detailed analysis.
    NCBI
    suggested: (NCBI, RRID:SCR_006472)
    Primers were compared to this database using Blast 2.8.1 and the following parameters (word size: 7, match score: 2, mismatch score: −3, gap open cost: 5, gap extend cost: 2).
    Blast
    suggested: (BLASTX, RRID:SCR_001653)
    Statistical and visualization software: All electronic health data analyses were performed in Python 3.7 and all models were fit using R 3.6.3.
    Python
    suggested: (IPython, RRID:SCR_001658)
    Additional statistical analyses, processing, transformation, and visualization of genomic data were completed in R / Bioconductor (‘Rsamtools’, ‘GenomicRanges’, ‘Biostrings’) and additional Imielinski Lab R packages (‘gTrack’, ‘gChain’, ‘gUtils’, ‘RSeqLib’) available at https://github.com/mskilab.
    Bioconductor
    suggested: (Bioconductor, RRID:SCR_006442)
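
One of the sentences in the table above refers to variant read support reported as VAF and alternate allele count. As a minimal sketch, assuming the usual definition of variant allele fraction (reads supporting the alternate allele divided by all reads covering the site), the relationship between the two quantities could look like this. The function name and the example counts are hypothetical and are not taken from the manuscript or from Rsamtools.

```python
def variant_allele_fraction(alt_count: int, depth: int) -> float:
    """Return the fraction of reads at a site that support the alternate allele.

    Assumption: `depth` is the total number of reads covering the site and
    `alt_count` is the subset of those reads carrying the alternate allele.
    """
    if depth <= 0:
        raise ValueError("depth must be a positive read count")
    if not 0 <= alt_count <= depth:
        raise ValueError("alt_count must lie between 0 and depth")
    return alt_count / depth


# Hypothetical site: 30 of 120 covering reads carry the alternate allele.
print(variant_allele_fraction(30, 120))
```

The guard clauses are a design choice for the sketch: a zero-depth site has no defined VAF, so it is rejected rather than silently reported as 0.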

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    While hospital-grade, core lab devices can achieve massive throughput (thousands of samples per day), a key limitation of these assays is accessibility of testing facilities to patients, the logistics of sample transport, and timely test reporting. These limitations become even more stark in the context of widespread quarantines and nationwide lockdowns, where requiring patients to travel (even for viral testing) incurs significant personal and public health risks. The most urgent diagnostic need in this situation is for scalable rapid point-of-care tests that can be potentially implemented in the home. Our validation of a rapid one-tube, dual-primer colorimetric SARS-CoV-2 assay with both qRT-PCR and total RNA-seq provides a potential solution to this problem. Further work will be needed to assess whether this LAMP assay can detect the presence of SARS-CoV-2 at even lower (but clinically relevant) viral concentrations in specimen types that are less cumbersome to collect than naso/oropharyngeal swabs (e.g. saliva, stool). As we demonstrate, this LAMP SARS-CoV-2 assay can be also applied for environmental sampling, which may be crucial in the containment and recovery phases of this pandemic. Specifically, LAMP positivity may quickly indicate if an area is infectious and a negative result (with appropriate confirmation) will possibly represent a lower risk. Indeed, these tools and methods can help create a viral “weather report” if broadly used and partnered with continual val...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We found bar graphs of continuous data. We recommend replacing bar graphs with more informative graphics, as many different datasets can lead to the same bar graph. The actual data may suggest different conclusions from the summary statistics. For more information, please see Weissgerber et al (2015).
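
The point that many different datasets can lead to the same bar graph can be illustrated with a small sketch. The two groups below are hypothetical values chosen for illustration only (they are not data from the manuscript), and `summarize` is a helper name introduced here. Both groups have the same mean, so a bar graph of means would draw identical bars, yet their spreads differ markedly.

```python
from statistics import mean


def summarize(values):
    """Return the mean (what a bar graph shows) and the range (what it hides)."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


# Hypothetical measurements, for illustration only.
group_a = [4.0, 4.5, 5.0, 5.5, 6.0]   # tightly clustered around the mean
group_b = [1.0, 1.0, 5.0, 9.0, 9.0]   # widely spread, roughly bimodal

for name, values in (("group_a", group_a), ("group_b", group_b)):
    print(name, summarize(values))
```

Plotting the individual points (for example as a dot plot or box plot) would make the difference between the two groups visible, which is the kind of alternative the recommendation refers to.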


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
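
To make the idea of checking for RRIDs concrete, here is a minimal sketch that pulls software-tool identifiers of the form `RRID:SCR_...` out of the "suggested:" lines in the resource table. It is an illustration only and does not represent how SciScore itself works: the regular expression, the function name `extract_rrids`, and the restriction to the `SCR_` prefix are all assumptions made for this example, and the check is purely syntactic (it does not confirm that an identifier is registered or matches the named tool).

```python
import re
from typing import List

# Assumption: only software/tool RRIDs with the SCR_ prefix are of interest here.
RRID_PATTERN = re.compile(r"RRID:(SCR_\d+)")


def extract_rrids(text: str) -> List[str]:
    """Return every SCR-style RRID found in `text`, in order of appearance."""
    return RRID_PATTERN.findall(text)


# Lines copied from the resource table above.
lines = [
    "suggested: (SAMTOOLS, RRID:SCR_002105)",
    "suggested: None",
    "suggested: (STAR, RRID:SCR_015899)",
]

for line in lines:
    found = extract_rrids(line)
    print(line, "->", found if found else "no RRID")
```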