PanTE: A Comprehensive Framework for Transposable Element Discovery in Graph-based Pangenomes
Abstract
Transposable element (TE) annotation is crucial for understanding genetics, genomics and evolution, yet current methods struggle to identify TEs in graph-based pangenomes. We developed PanTE, a framework for constructing accurate and representative TE libraries from both single genomes and graph pangenomes. PanTE is the first framework of its kind that can be applied directly to graph-based pangenomes to build population-level TE libraries. It partially reimplements RepeatModeler2 and integrates several key innovations: graph pangenome disassembly, alignment-free LTR structure detection, a machine learning-based classifier and efficiency-boosting strategies. PanTE outperformed RepeatModeler2 in handling large genomes, detecting high-abundance TEs and LTR-retrotransposons, and classifying TEs robustly, all with greater computational efficiency. Compared with EDTA, it annotated ~26% more TEs in the grapevine genome and ran up to 13 times faster on the wheat genome. PanTE represents a significant advance in population-wide TE discovery, making it particularly valuable for pangenomic studies.