SpliceSelectNet: A Hierarchical Transformer-Based Deep Learning Model for Splice Site Prediction

Abstract

Accurate RNA splicing is essential for gene expression and protein function, yet the mechanisms governing splice site recognition remain incompletely understood. Aberrant splicing caused by mutations can lead to severe diseases, including cancer and genetic disorders, underscoring the need for accurate computational tools to predict splice sites and detect disruptions. Existing methods have made significant advances in splice site prediction but are often limited in handling long-range dependencies, a factor critical to splicing regulation. Moreover, many models lack interpretability, hindering efforts to elucidate the underlying biological mechanisms. Here, we present SpliceSelectNet (SSNet), a novel deep-learning model that predicts splice sites directly from DNA sequences. The model handles long-range dependencies (up to 100 kb) using a hierarchical Transformer-based architecture with both local and global attention mechanisms. SSNet offers interpretability at the single-nucleotide level, making it particularly effective for identifying aberrant splicing caused by mutations. Our model surpasses the state of the art in splice site prediction on the Gencode test dataset and demonstrates superior performance in aberrant splicing prediction on the BRCA dataset and the deep intronic dataset.
