Predicting Future SARS-CoV-2 Mutations using Deep Learning
Abstract
SARS-CoV-2 continues to spread steadily around the world, contrary to many early predictions that it would disappear in less than two years. Although SARS-CoV-2 vaccines have significantly slowed the spread of infection, they have not stopped it. On the contrary, the World Health Organization has recently issued cautionary statements that infection counts are rising and that a large wave is expected in winter. Vaccines mostly target specific regions of the virus, and the high mutation rate of SARS-CoV-2 is an essential mechanism by which the virus escapes the available vaccines. Researchers have therefore been working on next-generation vaccines against new variants of the virus. Nevertheless, because clinical trials take a long time, SARS-CoV-2 acquires new mutations faster than vaccines can be adapted. Hence, there is a need for computational tools that can predict future SARS-CoV-2 mutations before they emerge. In this paper, we propose several deep-learning-based methods to predict possible future mutations in the SARS-CoV-2 genome. We design and evaluate various ensemble and bagging architectures enriched with a large set of genomic, biochemical, and phylogenetic features. We evaluate our models on GISAID data and demonstrate that the best-performing methods achieve an F1-Macro score of 0.78.
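For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1-Macro score is computed: the F1 score is calculated separately for each class and the per-class values are averaged with equal weight, so a rare class counts as much as a common one. The function name `macro_f1`, the binary labelling (1 = site mutates, 0 = site does not mutate), and the toy label lists are assumptions made for illustration; they are not taken from the paper's data or evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN), defined as 0 if the denominator is 0
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels: 1 = site mutates, 0 = site does not mutate
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(round(macro_f1(y_true, y_pred), 3))  # 0.625
```

In this toy example the per-class F1 scores are 0.75 for class 0 and 0.5 for class 1, giving a macro average of 0.625. The equal weighting is what makes the metric informative when one outcome is much rarer than the other.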