Functional Diversification of Gene Duplicates under the Constraint of Protein Structure
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
Gene differentiation following duplication plays a crucial role in evolution, driving the emergence of new functional genes. This process involves changes in key gene traits, such as protein structure, expression patterns, subcellular localization, and enzymatic activity, which together contribute to the development of novel functions. Here, we identified a group of homologous Glycoside Hydrolase Family 50 (GH50) agarases in the deep-sea bacterium Agarivorans albus strain JK6, providing an ideal model for studying gene differentiation following duplication. Phylogenetic analysis revealed that these enzymes arose through gene duplication and subsequent divergence. Experimental assays demonstrated that, while they retained similar glycoside hydrolase activity, their agarolytic activity diverged significantly. We further explored their structural variation under the constraints of the protein’s three-dimensional fold, the development of specific localization linked to changes in enzymatic activity, and distinct expression patterns induced by different sugars. Notably, structural variations were primarily concentrated in the active site, while the overall backbone remained highly conserved. This study highlights gene differentiation following duplication as a key evolutionary strategy, facilitating the transition from single enzymes to complex functional systems.
Article activity feed
-
The calculated Ka/Ks values are much less than 0.5, indicating that the overall sequences are under significant purifying selection.
This is the first time you mention that you've calculated Ka/Ks - I would suggest first describing these analyses and the motivation behind them.
-
(A)
What are the red branches? Are the tree topologies for structure and sequence identical?
-
Considering that tandem duplications are usually highly unstable and without the action of natural selection, amplified gene arrays rapidly disappear from populations
This is a bit awkwardly phrased - maybe reword to something like "Tandem duplications are usually highly unstable and typically disappear rapidly from their natural populations in the absence of natural selection acting to maintain them"
-
Through phylogenetic tree analysis
Given that all methods are in the supplement, I think it would be best if you specify the exact methods you're using here - are these trees produced as part of the base OrthoFinder workflow? How did you infer multiple sequence alignments?
-
analyze the complete protein sequence files of more than 600 bacterial strains
Were these protein strain datasets pre-processed to remove potential artificial gene duplicates, that is, to identify and remove highly redundant sequences that arise for bioinformatic reasons rather than from actual gene duplication events? Inferences of orthogroups using tools like OrthoFinder, and any downstream analyses, are strongly affected by the persistence of such artifacts.
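As a rough illustration of the kind of pre-processing check meant here, the following is a minimal sketch that groups byte-identical protein sequences within a single proteome so they can be inspected before orthogroup inference. It is not the authors' pipeline: the function names (`parse_fasta`, `group_identical`) and the toy sequences are hypothetical. It only catches exact duplicates; near-identical redundancy would need a clustering step, and identical copies can also be genuine recent duplications, so the sketch flags groups rather than deleting anything.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_identical(records):
    """Map each distinct sequence to the headers that carry it.

    Sequences are compared case-insensitively and a trailing stop
    symbol ('*') is ignored (an assumption about the input files).
    """
    groups = {}
    for header, seq in records:
        key = seq.upper().rstrip("*")
        groups.setdefault(key, []).append(header)
    return groups


# Toy input (hypothetical identifiers and sequences).
example = """>geneA
MKTLLA
>geneA_copy
mktlla*
>geneB
MKTLLG
"""

groups = group_identical(parse_fasta(example))
# Sequences shared by more than one record are candidates for manual review.
redundant = {seq: ids for seq, ids in groups.items() if len(ids) > 1}
```

Whether a flagged group reflects an assembly or annotation artifact or a real duplicate would still need to be judged from genomic context, for example whether the copies sit on distinct loci.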