An SE(3)-equivariant language model for pocket-aware 3D molecular generation enables discovery of potent HPK1 inhibitors

Abstract

Generating molecules that simultaneously achieve optimal 3D pocket-binding conformations and chemically plausible topologies remains a central challenge in AI-driven structure-based drug design. Graph-based models excel at SE(3)-equivariant spatial reasoning but often struggle to ensure chemical validity, whereas language models capture discrete chemical syntax yet lack 3D spatial understanding. Here we introduce SE3-BiLingoMol, an SE(3)-equivariant language model for pocket-aware de novo 3D ligand generation and fragment-guided optimization. Built upon Geometric Algebra Transformers and a fragment-aware SMILES representation, our model enables SE(3)-equivariant modeling of continuous 3D geometry while ensuring chemically valid molecular topologies. To counteract the cumulative 3D conformational errors inherent to autoregressive generation, we developed a bidirectional attention-based self-refinement mechanism as a key architectural component of SE3-BiLingoMol. Our model achieves state-of-the-art performance in an in silico evaluation across more than 100 diverse targets. Critically, applying SE3-BiLingoMol led to the discovery of a novel tetracyclic HPK1 inhibitor with potent in vitro activity and robust in vivo anti-tumor efficacy. This work demonstrates a powerful and practical generative AI framework for accelerating structure-based drug design.
