A 2-Gene Host Signature for Improved Accuracy of COVID-19 Diagnosis Agnostic to Viral Variants



Abstract

In this work, we study upper respiratory tract gene expression to develop and validate a 2-gene host-based COVID-19 diagnostic classifier and then demonstrate its implementation in a clinically practical qPCR assay. We find that the host classifier has utility for mitigating false-negative results, for example due to SARS-CoV-2 variants harboring mutations at primer target sites, and for mitigating false-positive viral PCR results due to laboratory cross-contamination.

Article activity feed

  1. SciScore for 10.1101/2022.01.06.21268498:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics: not detected.
    Sex as a biological variable: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Because we did not have access to the underlying sequencing data, we used the gene counts originally generated by the authors using STAR alignment and the R function featureCounts.
    STAR
    suggested: (STAR, RRID:SCR_004463)
    featureCounts
    suggested: (featureCounts, RRID:SCR_012919)
    For each RNA-seq cohort, gene counts were subjected to the variance stabilizing transformation (VST) from the R package DESeq2 (v. 1.26.0) and the transformed values were then standardized (centered and scaled) to yield the final input features.
    DESeq2
    suggested: (DESeq, RRID:SCR_000154)
    RNA-seq SVM classifier development and validation: SVM learning was implemented in scikit-learn (https://scikit-learn.org) using the sklearn.svm.
    scikit-learn
    suggested: (scikit-learn, RRID:SCR_002577)
    RNA-seq differential expression: Gene expression fold-changes in each RNA-seq cohort between the COVID-19 and non-viral samples (Figure 1d) and between the COVID-19 and other viral samples (Figure 1e) were calculated with the R package limma (v. 3.42), using quantile normalization and the voom method.
    limma
    suggested: (LIMMA, RRID:SCR_010943)
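The "standardized (centered and scaled)" step quoted above can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' code: the function name `standardize`, the example values, and the choice of the sample standard deviation (n - 1) are all hypothetical, and the VST itself is assumed to have been applied beforehand with DESeq2 in R.

```python
from statistics import mean, stdev


def standardize(matrix):
    """Center and scale each gene (column) of a samples-by-genes matrix.

    Assumption: scaling uses the sample standard deviation (n - 1);
    the source does not say which convention was used.
    """
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [
        # A gene with zero variance carries no information; map it to 0.0.
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, sds)]
        for row in matrix
    ]


# Hypothetical VST-transformed values for one cohort: 4 samples x 2 genes.
vst_values = [
    [8.0, 5.0],
    [10.0, 5.5],
    [12.0, 6.5],
    [14.0, 7.0],
]

# Each gene now has mean 0 and sample standard deviation 1 within the cohort.
features = standardize(vst_values)
```

Features prepared this way would then serve as classifier input, for example to the scikit-learn SVM mentioned in the report.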

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our study has some limitations. While our findings provide a framework for the rapid clinical translation of a host-based COVID-19 diagnostic, a randomized controlled trial of our assay will be needed to firmly establish its clinical utility. Our results suggest that addition of host targets is likely to improve diagnostic accuracy, however, a prospective assessment using clinically confirmed false-positive and false-negative viral tests is needed. Moreover, our classifier models were trained and tested on cohorts with particular characteristics, including the balance between COVID-19, other viral and non-viral samples; the mix of other respiratory viruses represented; and within the COVID-19 group, the distributions of viral load and of time since onset of infection. All these variables no doubt affect classifier performance and will vary in reality with time and place. However, the fact that our classifiers translated so well across diverse real-world cohorts argues that they are quite robust to these issues. While we did not explicitly explore it here, our results suggest that parsimonious host classifiers could serve not only as a COVID-19 diagnostic but also as a pan-respiratory virus surveillance tool. Even prior to the COVID-19 pandemic, viral lower respiratory tract infections were a leading cause of disease and death [16], and many respiratory viral infections go undetected, leading to preventable transmission and unnecessary antibiotic treatment [17]. Since our classifier...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.