Forecasting the Species of Major Contagious Viruses Through Deep Learning and Data Wrangling Methodology
Abstract
During viral outbreaks, timely identification of viral species helps researchers and healthcare professionals accelerate vaccine development and formulate targeted treatments. This paper presents a hybrid deep learning approach for accurate viral species classification from genomic sequences. The model combines convolutional neural networks (CNN) with long short-term memory (LSTM) networks to extract both local patterns and global dependencies from viral genomic data. We use several levels of genomic information: k-mer representations, GC content, nucleotide frequencies, and genome length. Using bioinformatics software and data wrangling methods, we built a comprehensive dataset from the National Center for Biotechnology Information (NCBI) and the GenBank database. The model achieved an accuracy of 98.46% (Table 1) in classifying viral species, substantially higher than conventional machine learning methods. This study offers a computationally efficient approach to analyzing viral genomic data and provides insight into the prediction of viral evolution, with the potential to support rapid response during disease outbreaks. The unique genomic signatures identified by the model may inform new antiviral strategies and applications in viral genome manipulation.
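The sequence-level features named in the abstract (k-mer representations, GC content, nucleotide frequencies, and genome length) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the default k = 3, and the decision to ignore ambiguous bases such as N are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"  # assumption: ambiguous bases (e.g. N) are excluded from all ratios


def kmer_counts(seq, k=3):
    """Count overlapping k-mers over the fixed A/C/G/T vocabulary (4**k entries)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocab = ("".join(p) for p in product(BASES, repeat=k))
    return {kmer: counts.get(kmer, 0) for kmer in vocab}


def nucleotide_frequencies(seq):
    """Fraction of each of A, C, G, T among the unambiguous bases."""
    seq = seq.upper()
    valid = sum(seq.count(b) for b in BASES)
    if valid == 0:
        return {b: 0.0 for b in BASES}
    return {b: seq.count(b) / valid for b in BASES}


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases."""
    freqs = nucleotide_frequencies(seq)
    return freqs["G"] + freqs["C"]


def extract_features(seq, k=3):
    """Flatten all four feature groups into one dictionary per genome."""
    features = {"length": len(seq), "gc_content": gc_content(seq)}
    for base, freq in nucleotide_frequencies(seq).items():
        features[f"freq_{base}"] = freq
    for kmer, count in kmer_counts(seq, k).items():
        features[f"kmer_{kmer}"] = count
    return features


# Hypothetical toy sequence; real genomes would be read from GenBank records.
features = extract_features("ATGCGC", k=2)
```

For the toy sequence above, `features["kmer_GC"]` is 2 and `features["length"]` is 6. A fixed k-mer vocabulary keeps the feature vector the same size for every genome, which is convenient when the vectors feed a downstream classifier.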