ImputeCoVNet: 2D ResNet Autoencoder for Imputation of SARS-CoV-2 Sequences

Abstract

We describe a new deep learning approach for the imputation of SARS-CoV-2 variants. Our model, ImputeCoVNet, consists of a 2D ResNet autoencoder that aims to impute missing genetic variants in SARS-CoV-2 sequences efficiently. We show that ImputeCoVNet yields accurate results at minor allele frequencies as low as 0.0001. When compared with an approach based on Hamming distance, ImputeCoVNet achieved comparable results with significantly less computation time. We also show that providing geographical metadata (e.g., exposed country) to the decoder increases imputation accuracy. Additionally, by visualizing the embeddings of SARS-CoV-2 variants, we show that the output of the trained ImputeCoVNet encoder recapitulates viral clade information, which suggests the encoder could be used for predictive tasks based on virus sequence analysis.
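
To make the Hamming-distance comparison concrete, the following is a minimal sketch of a nearest-neighbor imputation baseline. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how their baseline works, and the `MISSING` placeholder, the function names, and the toy sequences are all hypothetical.

```python
MISSING = "N"  # assumed placeholder for an unobserved site


def hamming(a, b):
    """Count mismatches over sites that are observed in both sequences."""
    return sum(
        1 for x, y in zip(a, b)
        if x != y and MISSING not in (x, y)
    )


def impute_nearest(query, references):
    """Fill missing sites in `query` from the closest reference sequence."""
    # Ties are broken in favor of the earliest reference in the list.
    best = min(references, key=lambda ref: hamming(query, ref))
    return "".join(r if q == MISSING else q for q, r in zip(query, best))


# Toy aligned sequences of equal length (hypothetical data).
references = ["ACGTAC", "ACGTTC", "TCGAAC"]
print(impute_nearest("ACNTTC", references))  # prints ACGTTC
```

Each query in this sketch is compared against every reference, so the cost grows with the size of the reference panel. That is one plausible reason a trained autoencoder, which needs a single forward pass per sequence, could be faster at inference time.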

SciScore for 10.1101/2021.08.13.456305:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Ethics	not detected.
Sex as a biological variable	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.

Table 2: Resources

No key resources detected.

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
No funding statement was detected.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.
