Substitution spectrum and selection at G-quadruplexes in great ape telomere-to-telomere genomes

Abstract

G-quadruplexes (G4s) are noncanonical DNA secondary structures formed by runs of guanines (stems) connected by other nucleotides (loops). These structures are enriched at regulatory regions such as promoters, CpG islands, untranslated regions (UTRs), enhancers, and replication origins, where they play key roles in transcription and replication. Although prior studies have demonstrated that G4s exhibit higher mutation rates than canonical DNA, little is known about the substitution patterns and selection acting specifically on G4 stems and loops. In this study, we utilized Telomere-to-Telomere (T2T) genome assemblies from human and two non-human great apes (chimpanzee and Bornean orangutan) to analyze substitution spectra and selective constraints within G4s, focusing on differences between stems and loops. We observed that fixed nucleotide substitutions leading to the gain or loss of G4 structures are more frequently located at stems, while those in G4s conserved across species are more often found at loops. On the other hand, single nucleotide polymorphisms had higher frequencies at stems than loops for all G4s, with a particularly high difference for singleton polymorphisms, suggesting higher mutation rates at stems than loops. To evaluate selection, we employed two approaches: we computed the ratio of substitution to polymorphism frequencies at stems vs. loops and performed phylogenetic modeling using PhyloFit. Both methods consistently revealed that stems of shared G4s experience stronger purifying selection than loops, particularly at promoters, enhancers, and UTRs. Our results provide novel insights into the sequence variation and selection of G4s, informing our understanding of their contributions to genome evolution and function.
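To make the stem and loop terminology concrete, the following minimal sketch annotates stems and loops in a DNA sequence. It assumes the widely used canonical motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; the abstract does not state which G4 predictor the authors used, so the pattern, the function name `annotate_g4`, and the example sequence are illustrative assumptions rather than the study's method.

```python
import re

# Assumed canonical motif (not confirmed by the abstract):
# G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}
# Odd-numbered groups are stems, even-numbered groups are loops.
G4_PATTERN = re.compile(
    r"(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})"
)


def annotate_g4(seq):
    """Return stem and loop coordinates for each motif match in seq.

    Coordinates are 0-based, half-open (start, end) tuples.
    Only the given strand is scanned; C-rich matches on the
    opposite strand would need the reverse complement.
    """
    annotations = []
    for match in G4_PATTERN.finditer(seq.upper()):
        stems = [match.span(i) for i in (1, 3, 5, 7)]
        loops = [match.span(i) for i in (2, 4, 6)]
        annotations.append({"span": match.span(), "stems": stems, "loops": loops})
    return annotations


# Hypothetical toy sequence: four GGG stems joined by single-base loops.
for g4 in annotate_g4("TTGGGAGGGTGGGCGGGTT"):
    print(g4["span"], g4["stems"], g4["loops"])
```

With coordinates like these, each substitution or polymorphism can be classified as falling in a stem or a loop before frequencies are compared.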

Significance Statement

G-quadruplexes (G4s) are noncanonical DNA structures that influence transcription, genome stability, and epigenetic regulation, yet their evolutionary dynamics in primates remain poorly understood. Leveraging recent T2T genome assemblies, we conducted a sequence-level analysis of G4 evolution across three ape lineages and used two methods to infer selection in G4s. Shared G4s and species-specific G4s display distinct evolutionary signatures. Using PhyloFit to estimate substitution rates and the Hudson–Kreitman–Aguadé test to contrast divergence with polymorphism, we found stems under markedly stronger purifying selection than loops, especially within promoters, CpG islands, and 5′UTRs. This pattern indicates that maintaining stem integrity is functionally critical and evolutionarily conserved. Our findings reveal how selective constraints vary both within G4 motifs and across genomic landscapes, offering insights for future studies on their functional importance and structural stability.
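The logic of contrasting divergence with polymorphism can be sketched as a simple ratio comparison. This is a simplified illustration, not the Hudson–Kreitman–Aguadé test itself and not the authors' pipeline: the function name `divergence_polymorphism_contrast` and all counts below are made-up placeholders. The idea is that if stems carry relatively fewer fixed substitutions per polymorphism than loops do, the stem-to-loop ratio falls below 1, which is the direction expected under stronger purifying selection on stems.

```python
def divergence_polymorphism_contrast(d_stem, p_stem, d_loop, p_loop):
    """Compare divergence-to-polymorphism ratios between stems and loops.

    d_* are counts of fixed substitutions, p_* are counts of polymorphisms.
    Returns (stem_ratio, loop_ratio, stem_over_loop).
    A stem_over_loop value below 1 means stems have fewer fixed
    substitutions per polymorphism than loops.
    """
    if min(p_stem, p_loop, d_loop) <= 0:
        raise ValueError("p_stem, p_loop, and d_loop must be positive")
    stem_ratio = d_stem / p_stem
    loop_ratio = d_loop / p_loop
    return stem_ratio, loop_ratio, stem_ratio / loop_ratio


# Placeholder counts for illustration only; these are not results from the study.
stem_ratio, loop_ratio, contrast = divergence_polymorphism_contrast(
    d_stem=20, p_stem=100, d_loop=60, p_loop=120
)
print(f"stem D/P = {stem_ratio:.2f}, loop D/P = {loop_ratio:.2f}, "
      f"stem/loop = {contrast:.2f}")
```

In practice such a contrast would be paired with a significance test and computed separately per genomic annotation (for example promoters or UTRs), since mutation rates and constraint can differ across regions.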
