Accelerating Natural Product Discovery with Linked MS-Genomics and Language/Transformer-Based Models

Abstract

An integrated multi-modal characterization of a microbial strain library streamlines natural product discovery. By using language- and transformer-based models to cross-validate mass spectrometry (MS)-genome datasets, microbial producers of diverse natural products are rapidly identified with high precision (75-100%). Our findings demonstrate the transformative potential of linked MS-genome datasets at the strain level to significantly accelerate discovery and extend our understanding of microbes beyond currently known and curated knowledge.
