TSdb: A Curated Database for Terpene Synthases and Their Application in Natural Product Mining
Abstract
Terpenoids constitute nature's most chemically diverse metabolite family and have vital pharmaceutical and industrial applications, yet existing databases lack systematic integration of precursor metabolic enzymes (HMGR, DXS) and mechanistic insights into terpene diversification. To bridge this gap, we developed the Terpene Synthase Database (TSDB), which offers three key innovations: (1) comprehensive integration of MVA/MEP pathway enzymes with downstream terpenoid synthases, (2) enhanced functional annotation via InterProScan domain mapping and phylogenetics to decode catalytic plasticity, and (3) unprecedented taxonomic breadth spanning 456,142 non-redundant sequences across 30,491 taxa. TSDB consolidates data from BRENDA, UniProt, TeroKit, and ocean gene clusters, deduplicated with BLASTp at a 95% identity cutoff, and reveals 3,499 Gene Ontology terms that highlight core functions such as isoprenoid biosynthesis (GO:0019288) and metalloenzyme catalysis. Validation against MIBiG gene clusters (e.g., BGC0001324) demonstrates precise identification of terpene cyclases, P450 monooxygenases, and prenyltransferases with residue-level active site annotations. As the first resource connecting precursor metabolism to structural diversity, TSDB enables accurate gene-enzyme-product prediction for enzyme engineering and natural product discovery.
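The abstract states only that redundancy was removed with BLASTp at a 95% identity cutoff; it does not describe the procedure. As an illustration, one common approach is a greedy pass over all-vs-all BLASTp hits in tabular format (`-outfmt 6`, where the first three columns are query ID, subject ID, and percent identity). The sketch below is an assumption about how such a step could look, not the authors' pipeline: the function names, the example sequence IDs, and the choice to ignore alignment coverage are all hypothetical.

```python
from collections import defaultdict


def parse_blast_tabular(lines):
    """Yield (query, subject, percent_identity) from BLAST tabular (outfmt 6) lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        yield fields[0], fields[1], float(fields[2])


def deduplicate(seq_ids, hits, identity_cutoff=95.0):
    """Greedy deduplication (hypothetical helper, not the TSDB implementation).

    Sequences are visited in the order given; a sequence is dropped if it has
    a hit at or above the identity cutoff to a sequence that was already kept.
    Alignment coverage is ignored here, which is a simplifying assumption.
    """
    neighbours = defaultdict(set)
    for query, subject, pident in hits:
        if query != subject and pident >= identity_cutoff:
            neighbours[query].add(subject)
            neighbours[subject].add(query)

    kept, kept_set = [], set()
    for seq_id in seq_ids:
        if neighbours[seq_id] & kept_set:
            continue  # near-identical to an existing representative
        kept.append(seq_id)
        kept_set.add(seq_id)
    return kept


# Made-up hits for illustration only (12 standard outfmt 6 columns).
example_hits = [
    "seqA\tseqB\t97.5\t300\t7\t0\t1\t300\t1\t300\t1e-150\t550",
    "seqA\tseqC\t80.0\t300\t60\t0\t1\t300\t1\t300\t1e-90\t340",
    "seqC\tseqD\t95.0\t300\t15\t0\t1\t300\t1\t300\t1e-140\t520",
]

representatives = deduplicate(
    ["seqA", "seqB", "seqC", "seqD"], parse_blast_tabular(example_hits)
)
print(representatives)
```

The result of a greedy pass depends on the order in which sequences are visited, so a real pipeline would typically sort them first (for example by length or annotation quality) and would usually add a minimum alignment-coverage requirement alongside the identity threshold.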