Reinforcement Learning-Based Generation of EGFR-Targeted Anticancer Small Molecules
Abstract
We report a reinforcement-learning-enhanced generative chemistry pipeline for the de novo design of small-molecule inhibitors targeting the Epidermal Growth Factor Receptor (EGFR). Starting from a pretrained ChemBERTa language model fine-tuned on high-affinity EGFR inhibitors, we introduce a multicomponent reward function combining predicted potency (40%), drug-likeness (QED, 25%), synthetic accessibility (SA, 15%), and novelty relative to the training library (20%). Through policy-gradient optimization over 500 iterations, the model learns to produce chemically valid, diverse, and novel scaffolds enriched for high composite rewards. Compared with the prior policy, the RL-tuned generator achieves a 20% increase in mean reward and yields a threefold expansion in unique Bemis-Murcko cores. High-throughput docking against the EGFR kinase domain (PDB ID: 1M17) shows that the newly generated library attains a median predicted binding affinity of –9.2 kcal/mol, significantly more favorable than the –8.5 kcal/mol baseline of known inhibitors. An exemplar generated ligand recapitulates key hinge-binding interactions while presenting a novel solvent-exposed substituent for further optimization. This study illustrates the power of integrating language-model pretraining with reinforcement learning and composite reward engineering to accelerate target-focused drug discovery.
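The weighted reward described in the abstract can be illustrated with a minimal sketch. Only the four weights (40%, 25%, 15%, 20%) come from the abstract. The function names `normalize_sa` and `composite_reward`, the assumption that potency, QED, and novelty are already scaled to [0, 1], and the linear rescaling of the SA score from its usual 1 (easy) to 10 (hard) range are all hypothetical choices made for illustration; the abstract does not state how the authors normalize or combine the components beyond the weights.

```python
# Weights as stated in the abstract: potency 40%, QED 25%, SA 15%, novelty 20%.
WEIGHTS = {"potency": 0.40, "qed": 0.25, "sa": 0.15, "novelty": 0.20}


def normalize_sa(sa_score: float) -> float:
    """Map a raw SA score (1 = easy, 10 = hard) to [0, 1], higher is better.

    Assumption: a linear rescaling; the abstract does not specify one.
    """
    clipped = min(max(sa_score, 1.0), 10.0)
    return (10.0 - clipped) / 9.0


def composite_reward(potency: float, qed: float, sa_score: float, novelty: float) -> float:
    """Weighted sum of the four reward components.

    Assumption: potency, qed, and novelty are already normalized to [0, 1]
    (for example, novelty as 1 minus the maximum similarity to the training set).
    """
    components = {
        "potency": potency,
        "qed": qed,
        "sa": normalize_sa(sa_score),
        "novelty": novelty,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical molecule: moderate potency, good QED, mid-range SA, fully novel.
    print(round(composite_reward(0.5, 0.8, 5.5, 1.0), 3))
```

Because the weights sum to 1 and every component is bounded in [0, 1], the composite reward is also bounded in [0, 1], which keeps the scale of the policy-gradient signal predictable across iterations.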