Strand inequality in terms of expression-dependent synonymous single-nucleotide polymorphism in the Escherichia coli chromosome
Abstract
Mutation in genomes is mainly attributed to replication, though transcription is also known to be mutagenic and is a more frequent event than replication in an organism. Recently, there have been reports of genome-wide transcription-induced mutagenesis. However, a clear demonstration that specific mutations are replication-dependent and/or transcription-dependent in genomes has yet to be established. Here, we studied synonymous single-nucleotide polymorphisms (SNPs) in 2091 individual coding sequences (CDS) in the leading strand (LeS) and the lagging strand (LaS) of the Escherichia coli chromosome by comparing across 157 strains. The frequencies of complementary transitions (ti) and complementary transversions (tv) were compared in each CDS to assess parity violation in the context of strand location and gene expression. The C→T and G→A ti exhibited the highest frequency as well as the most prominent strand inequality, as these ti were influenced by both strand location and gene expression. Interestingly, the inequality between T→C and A→G was expression-dependent but strand-independent. This was a direct demonstration of strand inequality due to expression but not due to replication. The A→T and G→T tv were universally more frequent than their complementary T→A and C→A, respectively. The strand-independent but expression-dependent synonymous SNP inequality in CDS supports the role of transcription-induced mutagenesis in contributing to strand inequality in the E. coli chromosome.
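The comparison of complementary substitutions described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the example counts, and the skew measure (n - m) / (n + m) are assumptions introduced here only to show how a substitution is classified as ti or tv, how its complementary substitution is found, and how an inequality between the two could be quantified for one CDS. It also assumes that the direction of each SNP (original base to new base) has already been determined.

```python
from collections import Counter

PURINES = {"A", "G"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def classify(ref, alt):
    """Return 'ti' for a transition (purine<->purine or pyrimidine<->pyrimidine),
    'tv' for a transversion (purine<->pyrimidine)."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    return "ti" if (ref in PURINES) == (alt in PURINES) else "tv"


def complementary(ref, alt):
    """Return the complementary substitution, e.g. C->T maps to G->A."""
    return COMPLEMENT[ref], COMPLEMENT[alt]


def parity_skew(counts, ref, alt):
    """Hypothetical inequality measure (an assumption, not the paper's statistic):
    (n - m) / (n + m), where n counts ref->alt and m counts its complement.
    0 means parity; positive values mean ref->alt is the more frequent."""
    n = counts[(ref, alt)]
    m = counts[complementary(ref, alt)]
    total = n + m
    return 0.0 if total == 0 else (n - m) / total


# Made-up counts for a single CDS, for illustration only.
example_counts = Counter({("C", "T"): 30, ("G", "A"): 10})
skew = parity_skew(example_counts, "C", "T")
```

Under this sketch, the four complementary pairs named in the abstract (C→T with G→A, T→C with A→G, A→T with T→A, G→T with C→A) each yield one skew value per CDS, which could then be compared between LeS and LaS genes or between expression classes.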
Significance Statement
Mutational patterns in bacterial genomes are shaped by both replication and transcription, but their relative roles remain unclear. By analyzing synonymous SNPs across 157 strains of E. coli, we identify strong strand-specific asymmetry consistent with replication-associated biases. A strong asymmetry between the strands is observed for the C→T and G→A ti, which are influenced by both replication asymmetry and expression. We further show that gene expression significantly influences mutation types, with the T→C ti enriched in highly expressed genes and the A→G ti enriched in lowly expressed genes. Notably, the A→T tv is strand-independent but exhibits a mild dependency on gene expression, revealing a previously unrecognized mutational pattern. These findings highlight how some SNPs are influenced by transcription but not by replication.