Motif selection enables efficient sequence-based classification of non-coding RNA

Abstract

Motivation: Classification of non-coding RNAs (ncRNAs) is important for genome annotation and for functional analysis of these molecules, yet efficient large-scale RNA classification remains challenging. Existing methods often rely on structural similarity; because they use secondary structure information, they require long computing times, which impedes large-scale use. Results: We present a sequence-based method that computes and selects common sequence motifs to provide a set of features for classifying ncRNA families with a supervised learning approach. The results show that our method achieves accuracy equal to or higher than that of existing structure-based methods while drastically reducing the required computing time. They also demonstrate that, with an appropriate selection of local sequence motifs, efficient sequence-based ncRNA classification can be achieved through supervised learning.
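The general pipeline described in the abstract (compute sequence motifs, select a discriminative subset, use them as features for a supervised classifier) can be illustrated with a small Python sketch. The abstract does not say how motifs are computed, how they are selected, or which classifier is used, so everything below is an assumption made for illustration: fixed-length k-mers stand in for motifs, the selection score is the difference in presence rate between one family and the rest, the classifier is a nearest-centroid rule, and the sequences and family names (`famA`, `famB`) are invented toy data. It assumes at least two families.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def select_motifs(families, k=4, top=4):
    """Rank k-mers by how much more often they occur in one family than in the rest.

    Assumption: k-mers are a stand-in for the motifs mentioned in the abstract,
    and this score is an illustrative selection criterion, not the authors' one.
    """
    # Number of sequences in each family that contain each k-mer.
    counts = {
        name: Counter(m for s in seqs for m in kmers(s, k))
        for name, seqs in families.items()
    }
    all_counts = Counter()
    for c in counts.values():
        all_counts.update(c)
    total = sum(len(seqs) for seqs in families.values())

    scores = {}
    for motif in all_counts:
        best = 0.0
        for name, seqs in families.items():
            inside = counts[name][motif] / len(seqs)
            outside = (all_counts[motif] - counts[name][motif]) / (total - len(seqs))
            best = max(best, inside - outside)
        scores[motif] = best
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))[:top]


def featurize(seq, motifs):
    """Binary feature vector: 1.0 if the motif occurs in seq, else 0.0."""
    present = kmers(seq, len(motifs[0]))
    return [1.0 if m in present else 0.0 for m in motifs]


def train_centroids(families, motifs):
    """Mean feature vector per family (nearest-centroid classifier, an assumption)."""
    centroids = {}
    for name, seqs in families.items():
        vectors = [featurize(s, motifs) for s in seqs]
        centroids[name] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return centroids


def classify(seq, motifs, centroids):
    """Assign seq to the family whose centroid is closest in squared Euclidean distance."""
    x = featurize(seq, motifs)
    return min(
        centroids,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(x, centroids[name])),
    )


# Invented toy data: every famA sequence contains GGAC, every famB sequence UUCG.
FAMILIES = {
    "famA": ["AUGGACUA", "CGGACAUU", "GGACCAUA"],
    "famB": ["AUUCGCAA", "CUUCGAUA", "UUCGACAU"],
}
MOTIFS = select_motifs(FAMILIES, k=4, top=4)
CENTROIDS = train_centroids(FAMILIES, MOTIFS)

print(MOTIFS)
print(classify("ACGGACAC", MOTIFS, CENTROIDS))
print(classify("CAUUCGCC", MOTIFS, CENTROIDS))
```

Because the features depend only on substring lookups, no secondary structure prediction is involved, which is the kind of saving in computing time the abstract attributes to a sequence-based approach. A realistic setup would use variable-length motifs, a larger training set, and a stronger classifier.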
