The genetic control of rapid genome content divergence in Arabidopsis thaliana
Abstract
Genome evolution in eukaryotes is driven predominantly by the dynamics of repetitive sequences, which vary widely in both copy number and sequence composition. The rate of repeat evolution varies both between and within species and is likely modulated by both genetics and environment. To uncover the factors shaping the rate of genome content evolution, we analyzed 1,142 resequenced Arabidopsis thaliana genomes using a novel k-mer-based approach. With this dataset, we characterized genome content variation and identified hypervariable regions that contribute to major differences in repeat abundance. We then treated repeat abundance as a quantitative trait and performed genome-wide association studies (GWAS) to map the genetic basis of copy number variation across more than 400 repeat families. We jointly analyzed these results using a meta-GWAS approach, revealing both cis-acting variants and over 50 trans-acting loci that regulate repeat abundance genome-wide. Finally, we found that purifying selection acts against mutations that increase the rate of genome content divergence, favoring alleles that limit repeat expansion. Together, our results provide new insights into the genetic architecture and evolutionary forces shaping genome evolution in plants.
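As a rough illustration of how repeat abundance could be turned into a quantitative trait from k-mer counts, here is a minimal sketch. It is not the authors' pipeline: the function names, the normalization by single-copy k-mers, and the toy reads are all assumptions made for illustration, and the sketch ignores reverse complements and sequencing errors.

```python
from collections import Counter
from statistics import median


def count_kmers(reads, k):
    """Count every substring of length k across a collection of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def repeat_abundance(counts, repeat_kmers, single_copy_kmers):
    """Hypothetical abundance estimate for one repeat family.

    Assumption: the median count of k-mers specific to the repeat family,
    divided by the median count of k-mers from single-copy sequence,
    approximates the family's copy number in that genome.
    """
    baseline = median(counts[km] for km in single_copy_kmers)
    if baseline == 0:
        raise ValueError("single-copy k-mers were not observed in the reads")
    return median(counts[km] for km in repeat_kmers) / baseline


# Toy data: the first read carries a tandemly repeated ACGT unit.
reads = ["ACGTACGTACGT", "TTGACC"]
counts = count_kmers(reads, k=4)
abundance = repeat_abundance(
    counts,
    repeat_kmers=["ACGT", "CGTA"],
    single_copy_kmers=["TTGA", "TGAC", "GACC"],
)
print(abundance)
```

Computed per accession and per repeat family, a value like this could serve as the phenotype column in an association analysis, which is the general idea the abstract describes.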