Integrating targeted genome mining and structure-guided modeling reveals unexplored 7-deazapurine-containing pathways
Abstract
7-Deazapurines are nucleoside analogs that play key roles in nucleic acid modification and can serve as building blocks for diverse bioactive secondary metabolites. Despite their biological significance, their biosynthetic diversity, distribution, and the enzymatic determinants of their structural diversification remain poorly understood. Here, we leverage large-scale targeted genome mining, phylogenetics, and network analysis to explore 7-deazapurine-containing pathways across ∼2 million bacterial genomes. We identified over 900 candidate biosynthetic gene clusters (BGCs), grouped into more than 100 families, most of which remain uncharacterized. These GATOR-GC-predicted BGCs were predominantly found in Streptomyces. We then examined enzyme-substrate interactions in three representative pathways: (i) peptidyl-deazapurines, (ii) huimycin, and (iii) dapiramicin A. Molecular docking and molecular dynamics (MD) simulations recapitulated known enzyme-substrate interactions and highlighted candidate catalytic residues governing amide bond formation, methylation, and glycosylation. Using this genome- and structure-guided framework, we identified a candidate BGC for dapiramicin A and proposed tailoring steps, including scaffold methylation and deoxy-sugar formation. These findings expand the known diversity of 7-deazapurine-containing BGCs and demonstrate how integrating genome mining with structural modeling can link BGCs to chemical function, providing a foundation for discovering and characterizing 7-deazapurine-containing secondary metabolites.