The GC-content at the 5′ ends of human protein-coding genes is undergoing mutational decay



Abstract

In vertebrates, most protein-coding genes have a peak of GC-content near their 5′ transcriptional start site (TSS). This feature promotes both the efficient nuclear export and translation of mRNAs. Despite the importance of GC-content for RNA metabolism, its general features, origin, and maintenance remain mysterious. We investigated the evolutionary forces shaping GC-content at the TSS of genes, both by comparative genomic analysis of nucleotide substitution rates between species and by examining human de novo mutations. Our data suggest that GC-peaks at TSSs were present in the last common ancestor of vertebrates and are largely dictated by recombination patterns. In primates and rodents, where PRDM9 directs recombination away from TSSs, GC-content at protein-coding gene TSSs is currently undergoing mutational decay. In canids, which lack PRDM9 and recombine at TSSs, GC-content at protein-coding gene TSSs is increasing. These patterns extend into the open reading frame, and we show that recombination-driven changes in GC-content affect choices at synonymous codon positions at the start of the open reading frame. Our results indicate that although high GC-content in protein-coding genes may be shaped by selective pressure to enhance expression, the dynamics of GC-content in mammals are largely shaped by patterns of recombination.
