Decoding RNA Triple Helices: Identification from Sequence and Secondary Structure

Abstract

The discovery of long non-coding RNAs (lncRNAs) has revealed additional layers of gene-expression control. Specific interactions of lncRNAs with DNA, RNAs, and RNA-binding proteins enable regulation in both cytoplasmic and nuclear compartments; for example, a conserved triple-helix motif is essential for MALAT1 stability and oncogenic activity. Here we present a secondary-structure-based framework to annotate and detect RNA triple helices. First, we extend the dot-bracket formalism with a third annotation line that encodes Hoogsteen contacts. Second, we introduce TripleMatcher, which searches for a triple-helix pattern, filters candidates by C1'-C1' distance thresholds, and merges overlapping candidates into region-level zones. On a set of 8 RNAs with experimentally established triple helices (telomerase RNAs and RNA-stability elements), TripleMatcher localized all annotated regions (structure-wise detection 8/8). Geometric filtering removed most spurious candidates, improving precision (PPV from 0.42 to 0.81) and overall accuracy (F_1 from 0.42 to 0.62) while maintaining sensitivity. Benchmarking eight secondary-structure predictors showed that pseudoknot-aware methods most reliably reproduce the local architecture required for detection, linking secondary-structure quality to downstream triple-helix recovery. Applied prospectively, the framework identified candidate regions directly from predicted secondary structures and scaled to a screen of 4,147 RNAs, where distance filtering reduced 150,948 raw candidates to 90 geometrically feasible regions across seven molecules, including human telomerase complexes. Together, the notation and TripleMatcher provide a concise route from secondary structure to a small, interpretable set of triple-helix candidates suitable for targeted experimental validation.
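
The filter-and-merge stage described in the abstract can be illustrated with a minimal sketch. This is not TripleMatcher's actual code: the function names, the candidate representation (a start position, an end position, and a list of residue-index pairs in contact), and the coordinate lookup are all assumptions made for illustration, and the distance threshold is left as a caller-supplied parameter because the abstract does not state its value.

```python
from math import dist

# Hypothetical data model (not from the paper):
#   candidate = (start, end, contacts), where contacts is a list of (i, j)
#   residue-index pairs proposed to be in contact;
#   c1_coords maps a residue index to the (x, y, z) position of its C1' atom.

def passes_distance_filter(contacts, c1_coords, max_dist):
    """Return True if every proposed contact has a C1'-C1' distance <= max_dist."""
    return all(dist(c1_coords[i], c1_coords[j]) <= max_dist for i, j in contacts)

def merge_into_zones(intervals):
    """Merge overlapping (start, end) intervals into region-level zones."""
    zones = []
    for start, end in sorted(intervals):
        if zones and start <= zones[-1][1]:
            # Overlaps the current zone: extend it if this interval reaches further.
            zones[-1][1] = max(zones[-1][1], end)
        else:
            zones.append([start, end])
    return [tuple(zone) for zone in zones]

def detect_zones(candidates, c1_coords, max_dist):
    """Drop geometrically infeasible candidates, then merge the survivors."""
    kept = [
        (start, end)
        for start, end, contacts in candidates
        if passes_distance_filter(contacts, c1_coords, max_dist)
    ]
    return merge_into_zones(kept)
```

Calling `detect_zones(candidates, c1_coords, max_dist)` with a threshold chosen by the user returns a sorted list of non-overlapping `(start, end)` zones. Merging after filtering, rather than before, keeps an infeasible candidate from stretching a zone beyond the residues that actually satisfy the distance criterion.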
