Explainable Deep-Learning on condition specific expression profiles reveals critical cytosines in gene regulation
Abstract
Compared with other nucleotides, cytosine is the most expressive for gene regulation in plants because of its role as a methylation-based epigenetic switch. Methylation of some cytosines may have a greater impact on downstream genes, making them critical. To date, little has been done to decipher the criticality of such cytosines. This is the first study to decode critical cytosines, drawing on a large volume of bisulfite and RNA-seq data, including 232 WGBS and 260 corresponding RNA-seq datasets from A. thaliana and rice. Using a deep learning system, a strong relationship was established between the methylation states of cytosines in their context and the expression levels of downstream genes. Using the same system, all 2 kb upstream regions of Arabidopsis and rice genes have been annotated for critical cytosines. Experimental validation demonstrated specific methylation changes at critical cytosines under heat stress that affected gene expression. The method is universal and may be applied to annotate other plant genomes for critical cytosines. GC% similarity, rather than homology, explains gene regulatory behavior under methylation control. The resource, the CritiCal-C portal, is publicly accessible, paving the way for revolutionary strategies in gene regulation and expression with minimal interventions.