Robust error-minimization in the genetic code across physicochemical metrics and variant codes: a graph-theoretic analysis in GF(2)⁶

Paul Clayworth
Sergey Kornilov

Abstract

The standard genetic code reduces the impact of point mutations, but the robustness of this property across physicochemical metrics, naturally variant codes, and codon-reassignment mechanisms remains incompletely quantified. We embed the 64 codons in GF(2)⁶, representing the hypercube Q₆ as a coordinate-dependent subgraph of the encoding-independent single-nucleotide mutation graph H(3, 4), which supports continuous ρ-interpolation between the two and enables joint analysis of physicochemical error minimization and codon-family topology. Under a block-preserving null (n = 10,000), the standard code is significantly low-cost across four distinct amino-acid distance metrics (Grantham p = 0.006; Miyata p < 0.001; Woese polar requirement p = 0.003; Kyte–Doolittle hydropathy p = 0.001), addressing the concern that prior optimality results could be metric-specific; the signal strengthens monotonically as ρ moves Q₆ → H(3, 4). Across the 27 NCBI translation tables, near-optimality is broadly preserved: 11 of the 12 informative-distance variants retain top-5% placement after BH–FDR correction (yeast mitochondrial is the sole marginal exception). Natural codon reassignments rarely break codon-family connectivity: under H(3, 4), only 6 of 28 observed events are topology-breaking versus 66% of 1,280 candidate moves (RR 0.32, permutation p ≤ 10⁻⁴). This depletion is robust to alternative topology definitions, clade exclusions, and base-to-bit encodings, although the small breaker subset (4 of 6 from a single yeast-mitochondrial lineage; denominator effect in clade-exclusion robustness) is underpowered for strong cross-clade inference. Conditional-logit decomposition shows that topology avoidance and local physicochemical cost provide complementary, only weakly correlated signal (r_s = 0.15); a heuristic tRNA-distance proxy does not improve fit, and several variant-code lineages show suggestive tRNA-gene enrichment for reassigned amino acids.
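The relation between the hypercube and the single-nucleotide mutation graph described above can be checked in a few lines. The following is a minimal Python sketch, not the authors' code: the base-to-bit map `ENCODING` is a hypothetical choice (the abstract does not state which encoding is used), and `edge_weight` is one assumed way to realize a ρ-interpolation, which may differ from the paper's definition.

```python
from itertools import product

BASES = "UCAG"
# Hypothetical 2-bit encoding per base (an assumption of this sketch).
ENCODING = {"U": (0, 0), "C": (0, 1), "A": (1, 0), "G": (1, 1)}

# All 64 codons as three-letter strings.
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def to_bits(codon):
    """Map a codon to a 6-bit vector, i.e. a point of GF(2)^6."""
    return tuple(bit for base in codon for bit in ENCODING[base])


def nucleotide_distance(a, b):
    """Number of codon positions at which the bases differ."""
    return sum(x != y for x, y in zip(a, b))


def bit_distance(a, b):
    """Hamming distance between the 6-bit images of two codons."""
    return sum(x != y for x, y in zip(to_bits(a), to_bits(b)))


def edges(dist):
    """Unordered codon pairs at distance exactly 1 under `dist`."""
    return {
        frozenset((a, b))
        for i, a in enumerate(CODONS)
        for b in CODONS[i + 1:]
        if dist(a, b) == 1
    }


H_EDGES = edges(nucleotide_distance)  # Hamming graph H(3, 4)
Q_EDGES = edges(bit_distance)         # hypercube Q6 under ENCODING


def edge_weight(a, b, rho):
    """Assumed interpolation: Q6 edges weigh 1, the remaining
    H(3, 4) edges weigh rho, and non-adjacent pairs weigh 0."""
    if nucleotide_distance(a, b) != 1:
        return 0.0
    return 1.0 if bit_distance(a, b) == 1 else float(rho)


print(len(H_EDGES), len(Q_EDGES), Q_EDGES <= H_EDGES)
```

Each codon has 9 single-nucleotide neighbours in H(3, 4) and 6 single-bit neighbours in Q₆, giving 288 and 192 edges respectively. With this particular encoding the 96 edges present only in H(3, 4) are the U↔G and C↔A substitutions; a different encoding would single out different pairs, which is why Q₆ is coordinate-dependent while H(3, 4) is not. In the sketch, `rho = 0` recovers Q₆ and `rho = 1` recovers H(3, 4).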
Retrospective reanalysis of nine genome-recoding datasets is consistent with, but does not establish, a working hypothesis in which codon-family topology operates at a different biological layer from acute cellular fitness: Syn61 tolerated 18,218 boundary-crossing serine swaps as a class (genome-wide, not per-codon-position viability), yet the same move type is 3.1-fold depleted across natural code evolution. The contribution is the second axis: code evolution is jointly constrained by physicochemical smoothness and codon-family topological integrity, and these two constraints are partly independent.
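The depletion figures quoted in the abstract can be related to one another with simple arithmetic. The sketch below recomputes a relative risk from the reported counts (6 of 28 observed events versus 66% of candidate moves) and its reciprocal. That the 3.1-fold depletion is the reciprocal of the reported RR is an assumption of this sketch, and because the 66% input is itself rounded, the results are approximate.

```python
# Counts and fraction as reported in the abstract.
observed_breaking = 6
observed_total = 28
candidate_breaking_fraction = 0.66  # rounded value from the abstract

# Fraction of observed reassignments that break codon-family topology.
observed_fraction = observed_breaking / observed_total

# Relative risk: observed fraction relative to the candidate-move baseline.
relative_risk = observed_fraction / candidate_breaking_fraction

# Fold depletion expressed as the reciprocal of the relative risk
# (assumed relationship, not stated explicitly in the abstract).
fold_depletion = 1 / relative_risk

print(f"observed fraction: {observed_fraction:.3f}")
print(f"relative risk:     {relative_risk:.2f}")
print(f"fold depletion:    {fold_depletion:.1f}")
```

Under these inputs the observed fraction is about 21%, and the recomputed values agree with the reported RR of 0.32 and the 3.1-fold depletion to the stated precision.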

Highlights

Codon-space geometry links genetic-code robustness and reassignment paths
Standard and variant codes preserve broad physicochemical error minimization
Reassignments are depleted for codon-family topology-breaking moves
Conditional-logit models separate topology from physicochemical similarity
Synthetic recoding shows boundary conditions for natural-code constraints

Version published to 10.64898/2026.04.25.720843 on bioRxiv
Apr 29, 2026

Ordered Gromov-Hausdorff Metric: A New Tool for Comparative Analysis of Protein Structures

This article has 2 authors:
1. Andrey V. Timofeev
2. Alexey S. Anufriev
This article has no evaluations
Latest version May 27, 2026

Informational blueprints reveal condition-dependent gene regulatory architectures

This article has 7 authors:
1. Doruk Efe Gökmen
2. Rosalind Wenshan Pan
3. Tom Röschinger
4. Stephen Quake
5. Hernan G Garcia
6. Rob Phillips
7. Vincenzo Vitelli
This article has no evaluations
Latest version May 20, 2026

vMUS-dBG: A Novel De Bruijn Graph Model for De Novo Genome Assembly Using Variable-Length Minimum Unique Substrings

This article has 4 authors:
1. Andrews Frimpong Adu
2. Elliot Sarpong Menkah
3. Peter Amoako-Yirenkyi
4. Samson Pandam Salifu
This article has no evaluations
Latest version Apr 27, 2026
