A 2-Gene Host Signature for Improved Accuracy of COVID-19 Diagnosis Agnostic to Viral Variants

Abstract

In this work, we study upper respiratory tract gene expression to develop and validate a 2-gene host-based COVID-19 diagnostic classifier and then demonstrate its implementation in a clinically practical qPCR assay. We find that the host classifier has utility for mitigating false-negative results, for example due to SARS-CoV-2 variants harboring mutations at primer target sites, and for mitigating false-positive viral PCR results due to laboratory cross-contamination.

SciScore for 10.1101/2022.01.06.21268498:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Ethics	not detected.
Sex as a biological variable	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Because we did not have access to the underlying sequencing data, we used the gene counts originally generated by the authors using STAR alignment and the R function featureCounts.	STAR suggested: (STAR, RRID:SCR_004463) featureCounts suggested: (featureCounts, RRID:SCR_012919)
For each RNA-seq cohort, gene counts were subjected to the variance stabilizing transformation (VST) from the R package DESeq2 (v. 1.26.0) and the transformed values were then standardized (centered and scaled) to yield the final input features.	DESeq2 suggested: (DESeq, RRID:SCR_000154)
RNA-seq SVM classifier development and validation: SVM learning was implemented in scikit-learn (https://scikit-learn.org) using the sklearn.svm.	scikit-learn suggested: (scikit-learn, RRID:SCR_002577)
RNA-seq differential expression: Gene expression fold-changes in each RNA-seq cohort between the COVID-19 and non-viral samples (Figure 1d) and between the COVID-19 and other viral samples (Figure 1e) were calculated with the R package limma (v. 3.42), using quantile normalization and the voom method.	limma suggested: (LIMMA, RRID:SCR_010943)
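
The standardization and SVM steps quoted in the table above can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic stand-ins for VST-transformed expression values of two hypothetical genes, the class means are invented for illustration, and the linear kernel is an assumption, since the report does not state which kernel or hyperparameters were used. Only `StandardScaler`, `SVC`, and `make_pipeline` from scikit-learn are relied on.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for VST-transformed expression of two hypothetical genes.
# The class means and spread below are invented, not taken from the study.
n_per_class = 50
covid = rng.normal(loc=[8.0, 5.0], scale=0.5, size=(n_per_class, 2))
other = rng.normal(loc=[5.0, 8.0], scale=0.5, size=(n_per_class, 2))
X = np.vstack([covid, other])
y = np.array([1] * n_per_class + [0] * n_per_class)  # 1 = COVID-19, 0 = other

# Center and scale each gene, then fit an SVM.
# The linear kernel is an assumption made for this sketch.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X, y)

# Accuracy on the training data only; a real evaluation would use held-out cohorts.
train_accuracy = model.score(X, y)

# The standardized features have zero mean and unit variance per gene.
scaled = model.named_steps["standardscaler"].transform(X)
```

Wrapping the scaler and classifier in one pipeline means the centering and scaling parameters are learned only from the data passed to `fit`, which avoids leaking information from a validation cohort into training.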

Results from OddPub: Thank you for sharing your code and data.

Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

Our study has some limitations. While our findings provide a framework for the rapid clinical translation of a host-based COVID-19 diagnostic, a randomized controlled trial of our assay will be needed to firmly establish its clinical utility. Our results suggest that addition of host targets is likely to improve diagnostic accuracy; however, a prospective assessment using clinically confirmed false-positive and false-negative viral tests is needed. Moreover, our classifier models were trained and tested on cohorts with particular characteristics, including the balance between COVID-19, other viral and non-viral samples; the mix of other respiratory viruses represented; and within the COVID-19 group, the distributions of viral load and of time since onset of infection. All these variables no doubt affect classifier performance and will vary in reality with time and place. However, the fact that our classifiers translated so well across diverse real-world cohorts argues that they are quite robust to these issues. While we did not explicitly explore it here, our results suggest that parsimonious host classifiers could serve not only as a COVID-19 diagnostic but also as a pan-respiratory virus surveillance tool. Even prior to the COVID-19 pandemic, viral lower respiratory tract infections were a leading cause of disease and death [16], and many respiratory viral infections go undetected, leading to preventable transmission and unnecessary antibiotic treatment [17]. Since our classifier...

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.

Related articles

Host transcriptional profiling identifies B cell associated genes to be upregulated in individuals with asymptomatic COVID-19 and latent tuberculosis

Integrated Transcriptomic Analysis Reveals Distinct Immune Response Signatures and Prognostic Biomarkers in SARS-CoV-2 Infection

Assessing Mass Screening as an Effective Tool for Pandemic Management