Using Deep Learning with Different Architectures to Recognize RNA:DNA Triplex Structures from Histone Modification Features


Abstract

Long non-coding RNAs (lncRNAs) can perform their regulatory roles by forming triple helices through RNA-DNA interaction. Although this has been verified by a few in vivo and in vitro methods, in silico approaches that predict the potential of lncRNAs and DNA sites to form triplex structures are still required. Triplexator has predicted large numbers of lncRNAs and DNA sites with the potential to form a triplex structure. There is also emerging experimental evidence that the presence of epigenetic marks at DNA sites and lncRNAs can facilitate the formation of RNA:DNA triplex structures. There is therefore a strong demand for computational approaches, such as deep learning, that can make novel predictions about RNA:DNA triplex structure formation. In this study, we developed four deep neural network models that predict, genome-wide, the potential of lncRNAs and DNA sites to form triple helices, taking histone modification marks as features. Our data were first passed through Triplexator to screen out lncRNAs and DNA sites with a low potential of forming triple helices. We used different deep learning architectures to build our models, including two-layer convolutional neural networks (CNN) and multilayer perceptrons (MLP). Our DNA2_CNN model performed best, with a mean AUC of 0.78 at a kernel size of 32 and a learning rate of 1e-3. Our deep neural network models identified several lncRNAs and DNA sites, including HOTAIR, MEG3, PARTICLE, DACOR1, MIR100HG, FENDRR, ANRIL, TUG1, MALAT1, LINC00599, TINCR, NEAT1, roX2, DHFR, OTX2-AS1, Xist, SNHG16, ATXN8OS, BCYRN1, TERC, and Khps1, as having the potential to form triplex structures, in agreement with previous experimental results and with the Triplexator predictions. The performance of our models also supports previous findings that histone modification marks can help identify lncRNAs and DNA regions that have the potential to form RNA:DNA triplex structures.
In conclusion, we showed that different deep learning architectures can recognize lncRNAs and DNA sites that have the potential to form RNA:DNA triplex structures.
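
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a mean AUC might be computed over several evaluation splits. It is not the authors' evaluation code: the abstract does not say whether the mean was taken over cross-validation folds or repeated runs, so the per-split averaging, the function names roc_auc and mean_auc, and the toy labels and scores are all assumptions made for illustration.

```python
from statistics import mean


def roc_auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    example scores higher than a randomly chosen negative one.
    Ties between a positive and a negative count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(splits):
    """Average the AUC over (labels, scores) pairs, one pair per split.
    Averaging per split is an assumption, not stated in the abstract."""
    return mean(roc_auc(y, s) for y, s in splits)


# Hypothetical toy data: 1 = triplex-forming site, 0 = background site.
toy_splits = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]),
    ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.1]),
]

if __name__ == "__main__":
    print(mean_auc(toy_splits))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.78 means a triplex-forming site is ranked above a background site in roughly 78% of positive-negative pairs.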
