A unified language model bridging de novo and fragment-based 3D molecule design delivers potent CBL-B inhibitors for cancer treatment
Abstract
The rational design of small molecules is central to drug discovery, yet current artificial intelligence (AI) methods for generating three-dimensional (3D) molecules are often siloed, focusing on either de novo design or fragment-based design. The lack of a holistic framework limits AI’s application across the complex, multi-step pipeline that runs from novel scaffold identification to lead compound optimization, and prevents AI from learning effectively from the entire process. Here, we introduce UniLingo3DMol, a language model for 3D molecular generation built on a molecular representation that supports fragment permutation, together with a multi-stage, multi-task training strategy. This integrated design enables UniLingo3DMol to span both de novo and fragment-retained molecular design, and it outperforms existing generation models in in silico evaluations across more than 100 diverse biological targets. We further applied UniLingo3DMol to the design of inhibitors targeting CBL-B, a crucial immune E3 ubiquitin ligase and an attractive immunotherapy target. This strategy yielded a lead compound with excellent in vitro activity and robust in vivo anti-tumor efficacy. Our findings establish UniLingo3DMol as a generalized and powerful platform with strong potential to advance AI-driven drug discovery.