Nanopore Direct RNA-Seq Reveals Widespread and Predictable Non-Coding Transcriptional Variations in DNA Methylation-deficient Arabidopsis
Abstract
Long non-coding RNAs are regulatory RNAs whose expression is highly diverse within populations. In both plant and animal genomes, expression of long intergenic non-coding RNAs (lincRNAs) is associated with reduced genomic DNA methylation in non-coding regions. Whether such newly activated lincRNAs are widespread in natural populations and show predictable patterns associated with natural selection has not been investigated. Here, we employed Oxford Nanopore Technologies direct RNA and DNA sequencing (ONT DRS and DDS) in the DNA methylation-deficient Arabidopsis mutants ddm1 and met1 to generate 41 million high-quality long RNA reads. In total, 340 lincRNAs were activated under defective DNA methylation, while 209 were constitutively expressed in both the mutants and wild type (WT). Expression of ddm1-activated lincRNAs was negatively correlated with DNA methylation levels. DNA-hypomethylation-activated lincRNAs can be detected in natural populations at low frequency and with high expression diversity. The ddm1-activated and non-ddm1-activated lincRNAs can be distinguished by a Random Forest algorithm and a deep convolutional neural network with up to 70% and 91% accuracy. The integrated results suggest that the dynamics of DNA methylation in non-coding regions are associated with non-coding RNAs in a predictable fashion, with potential new functions. This work advances a framework for leveraging epigenomic signatures in non-coding RNA discovery.
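
To make the reported negative correlation between lincRNA expression and DNA methylation concrete, the following is a minimal sketch of how such a relationship could be quantified. The abstract does not state which statistic was used, so the choice of Spearman rank correlation is an assumption, and all values, variable names, and function names below are synthetic placeholders for illustration only, not data or code from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Synthetic placeholder values (NOT from the study): methylation fraction at
# five hypothetical loci and the expression level of the lincRNA at each.
methylation = [0.90, 0.70, 0.50, 0.30, 0.10]
expression = [0.2, 1.0, 2.5, 4.0, 8.0]

rho = spearman(methylation, expression)
print(f"Spearman rho = {rho:.2f}")
```

In this toy input, expression rises strictly as methylation falls, so the coefficient sits at the negative extreme; real data would give a weaker value, and a significance test would be needed before drawing conclusions.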