Multi-Scale Protein Language Model for Unified Molecular Modeling

Abstract

Protein language models have demonstrated significant potential in protein engineering. However, current protein language models operate primarily at the residue scale, which limits their ability to provide information at the atom level. This limitation prevents us from fully exploiting protein language models in applications that involve both proteins and small molecules. In this paper, we propose ms-ESM (multi-scale ESM), a novel approach that enables multi-scale unified molecular modeling. ms-ESM achieves this by pre-training on multi-scale code-switch protein sequences and by using a multi-scale position encoding to capture relationships among residues and atoms. Experimental results indicate that ms-ESM surpasses previous methods on protein-molecule tasks, making fuller use of protein language models. Further investigations reveal that, through unified molecular modeling, ms-ESM not only gains molecular knowledge but also retains its understanding of proteins.
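To make the idea of a code-switch sequence concrete, here is a minimal sketch of how a residue-level sequence might be mixed with atom-level tokens. It is an illustration under stated assumptions, not the ms-ESM implementation: the function name `code_switch`, the `residue:atom` token format, the three-residue atom table, and the explicit choice of which positions to expand are all hypothetical. The returned residue index is one plausible input to a residue-scale position encoding, since every atom token keeps the index of the residue it came from.

```python
# Heavy-atom names (PDB convention) for three residues only, for brevity.
# The table and the token format are illustrative assumptions.
RESIDUE_ATOMS = {
    "G": ["N", "CA", "C", "O"],
    "A": ["N", "CA", "C", "O", "CB"],
    "S": ["N", "CA", "C", "O", "CB", "OG"],
}


def code_switch(sequence, unzip_positions):
    """Expand the residues at `unzip_positions` into their atoms.

    Returns the mixed token list and, for every token, the index of the
    residue it belongs to.
    """
    tokens = []
    residue_index = []
    for i, res in enumerate(sequence):
        if i in unzip_positions:
            # Atom-scale: one token per heavy atom of this residue.
            for atom in RESIDUE_ATOMS[res]:
                tokens.append(f"{res}:{atom}")
                residue_index.append(i)
        else:
            # Residue-scale: keep the single residue token.
            tokens.append(res)
            residue_index.append(i)
    return tokens, residue_index


tokens, residue_index = code_switch("GAS", {1})
print(tokens)
print(residue_index)
```

In this toy example the alanine at position 1 is expanded into five atom tokens while glycine and serine stay as residue tokens, so a single sequence carries both scales.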
