BiG-SCAPE 2.0 and BiG-SLiCE 2.0: scalable, accurate and interactive sequence clustering of metabolic gene clusters
Abstract
Microbial metabolic gene clusters encode the biosynthesis or catabolism of metabolites that facilitate ecological specialization, mediate microbiome interactions and constitute a major source of medicines and crop protection agents.
Here, we present BiG-SCAPE 2.0 and BiG-SLiCE 2.0, next-generation methods that facilitate scalable, accurate and interactive gene cluster analyses. BiG-SCAPE 2.0 updates its classification, alignment methods, and visualizations, enabling more accurate analysis, up to 8x faster runtimes and halved memory requirements. BiG-SLiCE 2.0 updates its distance metric, pHMM database, and classification logic, resulting in increased sensitivity nearing that of BiG-SCAPE.
Analysis of 260,630 biosynthetic gene clusters from publicly available genomes reveals that both tools generate concurring estimates of gene cluster diversity, thus providing significantly extended methodological support for recent evidence indicating that the vast majority of natural product diversity remains unexplored.
Together, these updates will facilitate global genome mining efforts for natural product discovery and microbiome analyses at a scale that matches current data sizes.