SHARP: Generating Synthesizable Molecules via Fragment-based Hierarchical Action-space Reinforcement Learning for Pareto Optimization

Jeonghyeon Kim
Seongok Ryu
Hahnbeom Park
Chaok Seok

Abstract

Designing drug-like molecules that satisfy multiple objectives, such as high binding affinity, synthesizability, and drug-likeness, poses a complex global optimization problem over an astronomically large chemical space. Existing deep learning-based molecular generative models often treat this task as distribution modeling, relying on atom-level autoregressive actions with little explicit optimization feedback. Consequently, they frequently generate invalid structures, converge to local optima, or produce synthetically infeasible candidates. Here, we introduce SHARP (Synthesizable Hierarchical Action-space Reinforcement learning for Pareto optimization), a molecular generator that addresses these limitations through a fragment-based hierarchical action space and reinforcement learning (RL). SHARP ensures synthetic accessibility by applying action masks guided by a pretrained Synthesizability Estimation Model (SEM). The RL policy is trained with a composite reward function that integrates docking scores, pharmacophore matching, and solvent accessibility, so that the generated molecules are functionally relevant and experimentally tractable. Across four lead optimization tasks (fragment growing, linker design, scaffold hopping, and sidechain decoration) on a diverse receptor set, SHARP consistently outperforms prior methods in producing molecules with high affinity and synthesizability. These results demonstrate that reinforcement learning with a chemically intuitive action space can be an effective solution to the optimization challenges in AI-driven drug discovery, offering a robust framework for rational molecular design in structure-based applications.
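
The abstract describes two mechanisms without giving implementation details: action masks derived from a synthesizability model, and a composite reward. The following is a minimal sketch of how such pieces are commonly put together, not the authors' implementation. The function names, the fixed threshold on the synthesizability score, and the weighted-sum form of the reward are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

def synthesizability_mask(sem_scores, threshold=0.5):
    """Allow only fragment actions whose (hypothetical) SEM score passes a cutoff.

    The threshold rule is an assumption; the abstract only says that masks
    are "guided by" a pretrained Synthesizability Estimation Model.
    """
    return [score >= threshold for score in sem_scores]

def masked_softmax(logits, mask):
    """Softmax restricted to allowed actions; masked actions get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be allowed")
    # Subtract the largest allowed logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]

def composite_reward(terms, weights):
    """Weighted sum of reward terms, each assumed pre-scaled so higher is better.

    In the setting of the abstract the terms would correspond to docking score,
    pharmacophore matching, and solvent accessibility; the linear combination
    is an illustrative assumption.
    """
    if len(terms) != len(weights):
        raise ValueError("terms and weights must have the same length")
    return sum(w * t for w, t in zip(weights, terms))

# Toy example: three candidate fragments, the second judged hard to synthesize.
mask = synthesizability_mask([0.9, 0.2, 0.6])
probs = masked_softmax([2.0, 1.0, 0.5], mask)
reward = composite_reward((1.0, 0.5, 0.25), (2.0, 1.0, 4.0))
```

Masking before the softmax, rather than penalizing infeasible actions in the reward, means the policy can never sample a disallowed fragment, which is one plausible reading of how masks would keep generated molecules synthesizable.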

Version published to 10.1101/2025.07.18.665529 on bioRxiv
Jul 23, 2025
