LOCATE: using Long-read to Characterize All Transposable Elements

Abstract

Transposons constitute ~45% of the human genome, driving gene evolution and contributing to disease, but their repetitive nature complicates the identification of new insertions. We present LOCATE (Long-read to Characterize All Transposable Elements), an algorithm using long-read sequencing to detect and assemble transposon insertions. LOCATE outperforms existing tools on simulated datasets and achieves the best performance in two previous benchmarks, as well as in a new benchmark we constructed using real biological datasets. Applying LOCATE to public datasets revealed that pre-existing Alu copies create two hotspots for Alu and LINE1 insertions: the A-rich linker and the poly(A) tail. We further observed a preference for self-insertions over non-self-insertions in Alu and LINE1, suggesting a "feedforward" transposition mechanism in which Alu and LINE1 RNA transcripts target the hotspots of their source copies to generate new insertions. LOCATE enhances our ability to study transposons and their role in genome dynamics.
