Rag2Mol: Structure-based drug design based on Retrieval Augmented Generation
Abstract
Artificial intelligence (AI) has brought tremendous progress to drug discovery, yet identifying hit and lead compounds with optimal physicochemical and pharmacological properties remains a significant challenge. Structure-based drug design (SBDD) has emerged as a promising paradigm, but inherent data biases and neglect of synthetic accessibility leave SBDD models disconnected from practical drug discovery. In this work, we explore two methodologies, Rag2Mol-G and Rag2Mol-R, both based on retrieval-augmented generation (RAG), for designing small molecules that fit a 3D pocket. The two methods either search the database for purchasable small molecules similar to the generated ones, or create new molecules from those in the database that can fit into a 3D pocket. Experimental results demonstrate that Rag2Mol methods consistently produce drug candidates with superior binding affinities and drug-likeness. We find that Rag2Mol-R provides broader coverage of the chemical landscape and more precise targeting capability than advanced virtual screening models. Notably, both workflows identified promising inhibitors for the challenging target PTPN2. Our highly extensible framework can integrate diverse SBDD methods, marking a significant advancement in AI-driven SBDD. The code is available at https://github.com/CQ-zhang-2016/Rag2Mol.
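The retrieval step mentioned above (finding purchasable molecules similar to a generated one) can be illustrated with a minimal sketch. The abstract does not state which similarity measure or fingerprint Rag2Mol uses, so this example assumes Tanimoto similarity over fingerprints represented as sets of on-bit indices, a common choice in cheminformatics. The function names, compound names, and toy fingerprints are all hypothetical and are not taken from the Rag2Mol code.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # assumed representation: indices of on-bits


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two on-bit sets."""
    if not a and not b:
        return 0.0  # convention for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def retrieve_similar(
    query: Fingerprint, library: Dict[str, Fingerprint], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Return the top_k library entries ranked by similarity to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy "purchasable" library with made-up fingerprints (illustration only).
library = {
    "cmpd_A": frozenset({1, 2, 3, 4}),
    "cmpd_B": frozenset({1, 2, 7}),
    "cmpd_C": frozenset({8, 9}),
}
generated = frozenset({1, 2, 3, 5})  # stands in for a generated molecule

hits = retrieve_similar(generated, library, top_k=2)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit and the library would be a catalog of purchasable compounds; the ranking logic stays the same.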