LncRAnalyzer: Uncovering Long Non-Coding RNAs and Exploring Gene Co-expression Patterns in Sorghum Genomics
Abstract
Sorghum (Sorghum bicolor (L.) Moench) is a versatile crop with significant phenotypic and genetic diversity. Despite the availability of multiple re-sequenced sorghum genomes, non-coding regions remain underexplored. Long non-coding RNAs (lncRNAs), a major class of non-coding RNAs, have low expression levels and complex expression patterns, and can be identified using RNA-seq. However, distinguishing lncRNAs from protein-coding genes is challenging because of their low abundance and tissue-specific expression. This study developed an automated pipeline, LncRAnalyzer, for identifying lncRNAs and Novel Protein Coding Transcripts (NPCTs) from RNA-seq datasets. Using a dual sorghum reference genome scheme, we identified 8770 lncRNAs in BTX642 (33.35% cultivar-specific) and 8869 lncRNAs in RTX430 (30.72% cultivar-specific). Some lncRNAs were linked to pre- and post-flowering drought tolerance in RTX430 and BTX642, respectively. The NPCTs encoded elements such as RE-1, TNT, zinc finger (zf) protein, FAR1-related, and xylan glucosyltransferase, suggesting their involvement in drought-specific phenotype development. Target predictions for lncRNAs revealed that both cis- and trans-regulated genes were associated with peroxidase, glutathione S-transferase, and MYB transcription factors (TFs) during drought, emphasizing their role in drought tolerance. Analysis of differentially expressed lncRNAs in leaf and root tissues (p-value < 0.05, |log2 fold change| > 2) revealed that the most abundant lncRNAs during drought regulate TFs such as C2H2-GATA, DUF domain, and WD-40. Downstream gene co-expression analysis showed time point-specific lncRNA interactions with TFs, highlighting their roles in drought response in sorghum. By identifying lncRNAs and NPCTs and integrating them into gene co-expression networks, LncRAnalyzer provided deeper insights into drought tolerance in sorghum.
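The differential-expression cutoffs quoted in the abstract (p-value < 0.05 and an absolute log2 fold change above 2) can be illustrated with a minimal sketch. This is not the LncRAnalyzer implementation: the record type, function names, transcript identifiers, and values below are hypothetical, and the abstract does not say whether the p-value is raw or adjusted, so the sketch simply treats it as a single number per transcript.

```python
from typing import List, NamedTuple


class DEResult(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    transcript_id: str
    log2_fold_change: float
    p_value: float


def passes_cutoffs(row: DEResult, p_cutoff: float = 0.05,
                   lfc_cutoff: float = 2.0) -> bool:
    """Return True when the row meets both cutoffs, using strict inequalities."""
    return row.p_value < p_cutoff and abs(row.log2_fold_change) > lfc_cutoff


def filter_de(rows: List[DEResult], p_cutoff: float = 0.05,
              lfc_cutoff: float = 2.0) -> List[DEResult]:
    """Keep only the rows that pass both cutoffs, preserving input order."""
    return [r for r in rows if passes_cutoffs(r, p_cutoff, lfc_cutoff)]


# Made-up example rows, for illustration only.
results = [
    DEResult("lnc_A", 3.1, 0.001),   # up-regulated, passes both cutoffs
    DEResult("lnc_B", -2.5, 0.04),   # down-regulated, passes both cutoffs
    DEResult("lnc_C", 1.9, 0.001),   # significant, but fold change too small
    DEResult("lnc_D", 4.0, 0.20),    # large fold change, but not significant
    DEResult("lnc_E", 2.0, 0.01),    # exactly 2 fails the strict '>' test
]

selected = filter_de(results)
print([r.transcript_id for r in selected])
```

Taking the absolute value of the log2 fold change keeps both up- and down-regulated transcripts, which matches the |log2 fold change| notation in the abstract.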