Comprehensive Hierarchical Classification of Transposable Elements based on Deep Learning

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Transposable elements (TEs) are DNA sequences capable of translocating within a genome. They constitute a substantial portion of eukaryotic genomes and play significant roles in genome evolution and gene regulation. The correct classification of these repetitive elements is essential to investigate their potential impact on genomes. Despite the existence of several tools for TE classification, they often neglect the importance of simultaneously utilizing global and local information for TE-type identification, resulting in suboptimal performance. Furthermore, these tools are not user-friendly due to the complex installation processes and numerous dependencies. In this study, we introduced a novel framework, CREATE, which leverages the strengths of C onvolutional and R ecurrent Neural N E tworks, combined with A ttention mechanisms, for efficient TE classification. Given the tree-like structure of TE groups, we separately trained nine models within the class hierarchy. Benchmarking experiments showed that CREATE significantly outperformed other TE classification tools. The source code and demo data for CREATE are available at https://github.com/yangqi-cs/CREATE . To facilitate TE annotation for researchers, we have developed a web platform, named WebDLTE, based on the CREATE framework. This platform employs GPU-accelerated pre-trained deep learning models for real-time TE classification and offers the most comprehensive collection of TEs for download. The web interface can be accessed at https://www.webdlte.nwpu.edu.cn .

Article activity feed