Bioinformatic Mining of Novel Lipopeptides Enabled by Daptomycin Cs Domain and Structural Modeling
Abstract
Amid the escalating crisis of antibiotic resistance, lipopeptides have emerged as promising therapeutic candidates owing to their unique amphipathic structures. In this study, we developed a systematic bioinformatic platform that uses the condensation starter (Cs) domain of daptomycin as a molecular probe. Sequence similarity network (SSN) analysis identified 613 potential lipopeptide biosynthetic gene clusters (BGCs), of which 432 (70.5%) originate from Streptomyces species. Subsequent integration of antiSMASH boundary prediction and evolutionary analysis prioritized 33 candidate BGCs harboring multiple post-modification modules. Five novel BGC types were ultimately selected on the basis of low Cs domain homology (<40% identity) and modification complexity. AlphaFold3 modeling revealed that WP_386473946.1 possesses a distinctive loop architecture, an expanded catalytic cavity, and a unique Asp60-Ala90 structural unit. Crucially, glycine residues adjacent to the conserved HHxxDG motif in the active pocket provide key targets for substrate recognition and rational engineering. This work delivers structure-guided genomic resources and molecular blueprints for the accelerated discovery of lipopeptides active against drug-resistant pathogens.
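As a rough illustration of the motif-centred analysis mentioned in the abstract, the sketch below scans a protein sequence for the HHxxDG pattern (as written above) and lists glycine residues in its flanking regions. It is a minimal sketch, not the authors' pipeline: the function name `glycines_near_motif`, the five-residue window, and the toy sequence are all assumptions made for illustration, and the toy sequence is not WP_386473946.1.

```python
import re

# Motif as written in the abstract: His-His, two variable residues, Asp-Gly.
MOTIF = re.compile(r"HH..DG")


def glycines_near_motif(seq, window=5):
    """For each motif hit, return (1-based motif start, 1-based positions of
    Gly residues within `window` residues on either side of the motif).

    The Gly inside the motif itself is excluded. `window` is an assumed,
    illustrative cutoff, not a value taken from the study.
    """
    hits = []
    for match in MOTIF.finditer(seq):
        start, end = match.start(), match.end()
        left = range(max(0, start - window), start)
        right = range(end, min(len(seq), end + window))
        flanking = [i + 1 for i in list(left) + list(right) if seq[i] == "G"]
        hits.append((start + 1, flanking))
    return hits


# Hypothetical toy sequence, used only to exercise the function.
toy = "MAGTHHAADGSGLK"
print(glycines_near_motif(toy))
```

In a real analysis the positions would be mapped onto the predicted structure to check which glycines actually line the active pocket; sequence proximity alone does not establish that.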