Programmed DNA variability as a general evolutionary principle: insights from analysis of PE_PGRS genes in Mycobacterium tuberculosis
Abstract
The PE_PGRS gene family in Mycobacterium tuberculosis (Mtb) exhibits extensive sequence variability across genotypes, supporting antigenic divergence. Here we investigate how Mtb—despite lacking horizontal gene transfer—balances genomic stability with adaptive plasticity. Comparative analysis of 88 bacterial genomes reveals that PE_PGRS genes are structurally optimized for mutability: they are enriched in sterically active tetramers such as CGGC (1.7–7.4% against a genome average of 1.62%) and depleted in out-of-frame stop codons, conferring robustness to 1-nt and 2-nt frameshifts. CGGC motifs are predicted to promote secondary DNA structures that destabilize replication and lead to replication errors, while the low abundance of out-of-frame stop codons allows continued translation beyond frameshifts, generating abrupt changes in protein sequence and length. This dual organization may underlie the extraordinary adaptability of Mtb and highlight a broader principle by which pathogens evolve under strong constraints on horizontal gene transfer. We propose that CGGC-rich regions function as universal programmed mutational hotspots across a wide range of microorganisms.
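The two sequence features named in the abstract, CGGC tetramer content and the abundance of out-of-frame stop codons, could be measured along the following lines. This is a minimal sketch, not the authors' pipeline. The function names are hypothetical, and the tetramer percentage is assumed to be the share of overlapping 4-nt windows that match the motif; the abstract does not state the exact definition used. Stop codons are assumed to be the standard TAA, TAG, and TGA.

```python
# Illustrative sketch only; names and the frequency definition are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (assumed)


def tetramer_fraction(seq, motif="CGGC"):
    """Fraction of overlapping windows of len(motif) that equal the motif."""
    seq = seq.upper()
    k = len(motif)
    windows = len(seq) - k + 1
    if windows <= 0:
        return 0.0
    hits = sum(1 for i in range(windows) if seq[i:i + k] == motif)
    return hits / windows


def off_frame_stops(seq):
    """Count stop codons in the reading frames offset by 1 nt and 2 nt.

    The sequence is assumed to start in frame (offset 0), so offsets 1 and 2
    are the frames a ribosome would read after a frameshift.
    """
    seq = seq.upper()
    counts = {}
    for shift in (1, 2):
        counts[shift] = sum(
            1
            for i in range(shift, len(seq) - 2, 3)  # full codons only
            if seq[i:i + 3] in STOP_CODONS
        )
    return counts


if __name__ == "__main__":
    toy_gene = "ATGGGCGGCGCCGGCGGCAACGGCTGA"  # made-up sequence, not real data
    print(f"CGGC fraction: {tetramer_fraction(toy_gene):.3f}")
    print(f"Off-frame stops: {off_frame_stops(toy_gene)}")
```

Run per gene and compared with the same statistics over the whole genome, these two numbers correspond to the enrichment and depletion signals the abstract describes. A real analysis would also need to handle the reverse strand and annotated gene coordinates, which this sketch omits.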