Decoding RNA Triple Helices: Identification from Sequence and Secondary Structure

Margherita A.G. Matarrese
Michela Quadrini
Nicole Luchetti
Federico Di Petta
Daniele Durante
Monica Ballarino
Letizia Chiodo
Luca Tesei


Abstract

The discovery of long non-coding RNAs has revealed additional layers of gene-expression control. Specific interactions of lncRNAs with DNA, RNAs, and RNA-binding proteins enable regulation in both cytoplasmic and nuclear compartments; for example, a conserved triple-helix motif is essential for MALAT1 stability and oncogenic activity. Here we present a secondary-structure–based framework to annotate and detect RNA triple helices. First, we extend the dot–bracket formalism with a third annotation line that encodes Hoogsteen contacts. Second, we introduce TripleMatcher, which searches for a triple-helix pattern, filters candidates by C1′–C1′ distance thresholds, and merges overlaps into region-level zones. Using telomerase RNAs and RNA-stability elements with experimentally established triple helices (8 RNAs), TripleMatcher localized all annotated regions (structure-wise detection 8/8); geometric filtering removed most spurious candidates and improved precision (PPV from 0.42 to 0.81) and overall accuracy (F₁ from 0.42 to 0.62) while maintaining sensitivity. Benchmarking eight predictors showed that pseudoknot-aware methods most reliably reproduce the local architecture required for detection, aligning secondary-structure quality with downstream triple-helix recovery. Applied prospectively, the framework identified candidate regions directly from predicted secondary structures and scaled to a screen of 4,147 RNAs, where distance filtering reduced 150,948 raw candidates to 90 geometrically feasible regions across seven molecules, including human telomerase complexes. Together, the notation and TripleMatcher provide a concise route from secondary structure to a small, interpretable set of triple-helix candidates suitable for targeted experimental validation.
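
The last two stages named in the abstract (distance filtering, then merging overlapping candidates into zones) can be illustrated with a small sketch. This is not the TripleMatcher implementation: the `Candidate` class, both function names, the 12.0 Å cutoff, and the sample coordinates are hypothetical and chosen only to show the idea. The abstract does not state the actual thresholds or data structures.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate region; not a TripleMatcher data type."""
    start: int          # first nucleotide index of the region (inclusive)
    end: int            # last nucleotide index of the region (inclusive)
    c1_distance: float  # assumed C1'-C1' distance for the candidate, in angstroms


def filter_by_distance(cands: List[Candidate], max_distance: float) -> List[Candidate]:
    """Keep only candidates whose C1'-C1' distance is within the cutoff."""
    return [c for c in cands if c.c1_distance <= max_distance]


def merge_overlaps(cands: List[Candidate]) -> List[Tuple[int, int]]:
    """Merge overlapping candidate intervals into region-level zones.

    Two candidates are merged only when they share at least one position;
    merely adjacent intervals are kept separate.
    """
    zones: List[Tuple[int, int]] = []
    for start, end in sorted((c.start, c.end) for c in cands):
        if zones and start <= zones[-1][1]:
            # Overlaps the current zone: extend its right edge if needed.
            zones[-1] = (zones[-1][0], max(zones[-1][1], end))
        else:
            zones.append((start, end))
    return zones


# Illustrative input only; these values are invented for the example.
raw = [
    Candidate(10, 20, 9.5),
    Candidate(18, 25, 10.2),
    Candidate(40, 45, 30.0),  # exceeds the assumed cutoff and is dropped
    Candidate(60, 66, 11.0),
]
kept = filter_by_distance(raw, max_distance=12.0)  # 12.0 is an assumed cutoff
zones = merge_overlaps(kept)
print(zones)  # [(10, 25), (60, 66)]
```

Filtering before merging keeps geometrically infeasible candidates from bridging two otherwise separate zones, which is one plausible reason for the order given in the abstract.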

Version published to bioRxiv on Oct 3, 2025 (DOI: 10.1101/2025.10.01.679706)
