Multimodal AI unveils a billion-scale microbial enzyme atlas

Abstract

Microbial enzymes constitute a vast reservoir for biocatalyst discovery, yet the functional catalog of enzymes within the microbial proteome remains largely uncharted. Here, we develop RAMER, a pretrained multimodal artificial intelligence (AI) model for enzymes that integrates protein sequences, structures, and catalytic reactions. RAMER substantially improves recall for enzyme function annotation at level 4 Enzyme Commission (EC) classes. Applying RAMER to 3.08 billion proteins from marine, freshwater, terrestrial, and extreme microbiomes, we constructed an atlas of 1.04 billion enzymes assigned to over 5600 EC classes, representing nearly 30 times the number of automatically annotated enzymes in UniProtKB/TrEMBL. Across microbiomes from distinct environments, enzymes account for 40% to 51.1% of the proteomes and exhibit broadly similar catalytic function profiles. Beyond EC classes, RAMER enables fine-grained organization of specific enzymes in a manner that reflects functional structural domains and conformations, further facilitating enzyme mining and exploration of natural enzyme diversity. With RAMER, we mined and validated fluorinases from Actinomycetota and PETases from the Mariana Trench microbiome, some of which possessed new structural conformations. Together, the atlas provides a comprehensive resource and a perspective on microbial enzymes, and RAMER offers an AI framework for function-driven biocatalyst discovery.
