Machine learning based pan-plant analyses of transposable elements across 352 species illuminates genome evolution
Abstract
Transposable elements (TEs), nature’s ‘genetic engineers’, are pivotal drivers of genome evolution, yet the precise mechanisms by which they shape plant functional innovation remain elusive. This study presents a comprehensive analysis of TEs across 558 high-quality plant genomes, encompassing 352 species from 221 genera across five phyla, ranging from algae to angiosperms. We identified over 460 million TEs and 67 million transposase domains, and systematically assessed their impact on host genomes through gene domestication, noncoding RNA generation, and gene duplication. Our analysis revealed 1,258,230 genes domesticated from TEs, 1,165,059 ncRNAs originating from TEs, and 10,488,967 TE-induced gene duplications. These genes affect more than 2,805 functional families, likely playing crucial roles at key stages of plant evolution. Using a machine learning-based framework, we uncovered 1,536 lineage-specific functional gene families significantly influenced by TEs, with enzymes and transcription factors being predominant. Notably, we elucidated the role of TEs in expanding transcription factor gene families and in facilitating potential horizontal gene transfer of synthase gene families. This study provides unprecedented insights into TE-driven plant evolution, demonstrating how TEs contribute to key innovations at various evolutionary stages. Our findings not only enhance understanding of plant genome dynamics but also offer valuable resources for crop improvement and synthetic biology, illuminating both current knowledge and the future potential of evolutionary processes.