Using image-based haplotype alignments to map global adaptation of SARS-CoV-2
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Quantifying evolutionary change among viral genomes is an important clinical device to track critical adaptations geographically and temporally. We built image-based haplotype-guided evolutionary inference (ImHapE) to quantify adaptations in expanding populations of non-recombining SARS-CoV-2 genomes. By combining classic population genetic summaries with image-based deep learning methods, we show that different rates of positive selection are driving evolutionary fitness and dispersal of SARS-CoV-2 globally. A 1.35-fold increase in evolutionary fitness is observed within the UK, associated with expansion of both the B.1.177 and B.1.1.7 SARS-CoV-2 lineages.
Article activity feed
SciScore for 10.1101/2021.01.13.426571:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
- Sentence: We note that, although we integrate a custom simulation framework into our pipeline, any simulation output can be trained using the CNN if the sequence alignments are converted to binary-encoded NumPy files.
  Resource: NumPy; suggested: (NumPy, RRID:SCR_008633)
- Sentence: We then aligned each sequence to the Wuhan reference genome (NCBI RefSeq: NC_045512) using a Needleman-Wunsch rapid global alignment implemented in EMBOSS stretcher (default settings).
  Resource: EMBOSS; suggested: (EMBOSS, RRID:SCR_008493)
- Sentence: All data processing, visualization, and analysis was performed using python v3.6.0 or R v4.0.3.
  Resource: python; suggested: (IPython, RRID:SCR_001658)
- Sentence: All machine learning models were implemented in Keras using tensorflow v2.3.0 or scikit-learn v0.24.0.
  Resource: tensorflow; suggested: (tensorflow, RRID:SCR_016345)
  Resource: scikit-learn; suggested: (scikit-learn, RRID:SCR_002577)
Results from OddPub: Thank you for sharing your code and data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
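The Resources table in this report quotes the manuscript as saying that any simulation output can be used with the CNN once the sequence alignments are converted to binary-encoded NumPy files. The report does not describe the encoding itself, so the following is only a minimal sketch under stated assumptions: sequences are already aligned to the reference and have equal length, a site is coded 1 when it carries an unambiguous base (A, C, G, T) that differs from the reference and 0 otherwise, and only segregating columns are kept. The function name `encode_alignment`, the toy sequences, and the output file name are hypothetical and are not taken from the authors' pipeline.

```python
import os
import tempfile

import numpy as np


def encode_alignment(reference, sequences):
    """Binary-encode aligned sequences against an aligned reference.

    Assumption (not from the manuscript): 1 marks an unambiguous base that
    differs from the reference; matches, gaps ('-') and ambiguous bases ('N')
    are coded 0.
    """
    ref = np.frombuffer(reference.upper().encode("ascii"), dtype=np.uint8)
    valid_bases = np.frombuffer(b"ACGT", dtype=np.uint8)
    rows = []
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("sequences must match the aligned reference length")
        arr = np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)
        informative = np.isin(arr, valid_bases)
        rows.append(((arr != ref) & informative).astype(np.uint8))
    return np.vstack(rows)


# Toy data for illustration only.
reference = "ACGTACGT"
sequences = ["ACGTACGT", "ACGAACGT", "ACGTNCG-", "TCGAACGT"]

matrix = encode_alignment(reference, sequences)

# Keep only columns where at least one sequence differs from the reference.
segregating = matrix[:, matrix.any(axis=0)]

# Save as a .npy file and read it back to confirm the round trip.
out_path = os.path.join(tempfile.mkdtemp(), "toy_alignment.npy")
np.save(out_path, segregating)
loaded = np.load(out_path)
```

Each row of `segregating` corresponds to one genome and each column to one variable site, which gives the image-like 0/1 matrix that a convolutional network can take as input. How the authors order rows or columns, or handle missing data, is not stated in this report and would need to be checked against their shared code.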