DTMol: Pocket-based Molecular Docking using Diffusion Transformers

Haotian Teng
Ran Wang
Yihang Shen
Ye Yuan
Carl Kingsford


Abstract

In computational chemistry, molecular docking—predicting the binding structure of a small molecule ligand to a protein—is vital for understanding interactions between small molecules and their protein targets, with broad applications in drug discovery (Morris and Lim-Wilby, 2008). Traditional docking methods rely on energy-based scoring functions and optimization algorithms to identify ligand-binding structures. However, these methods are often slow and inaccurate due to the extensive search space for binding structures and the complexity of scoring function landscapes (Corso et al., 2023).

Deep learning techniques have emerged as promising alternatives for molecular docking. These methods harness the power of neural networks to understand the complex interactions between protein pockets and ligands from large datasets. They can be classified into two categories: regression-based methods (Stark et al., 2022; Lu et al., 2022), which offer greater computational efficiency compared to traditional docking methods but have yet to offer substantial improvements in accuracy (Corso et al., 2023); and generative models, particularly diffusion models (Corso et al., 2023; Qiao et al., 2024; Nakata et al., 2023; Lu et al., 2024; Schneuing et al., 2024), which lead to significant accuracy improvements compared to regression-based methods.

Most diffusion-based docking methods aim to address the blind docking problem, where the ligand is docked without prior knowledge of the specific protein pocket—a site of the protein with potential ligand-binding capabilities—and only the overall protein structure is provided (Yim et al., 2024). However, in many drug discovery cases, specific pockets have already been identified (Zheng et al., 2013), meaning that the docking process only needs to determine how the ligand binds to the pocket. This defines the pocket-based docking problem. Pocket-based docking leverages the physical and chemical information of protein side chains around the pocket, significantly reducing the search space for ligand binding structures and improving the understanding of atomic-level interactions. Additionally, focusing on the local protein substructure rather than the global structure enhances computational efficiency, which allows for the use of more complex neural architectures.
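The reduction in search space that comes from working on a local substructure can be illustrated with a small sketch. It assumes a pocket is defined as every protein atom within a fixed radius of a known pocket center. The function name `crop_pocket`, the 10 Å radius, and the atom dictionary layout are hypothetical choices made for illustration and are not taken from DTMol.

```python
import math

def crop_pocket(protein_atoms, center, radius=10.0):
    """Return the atoms lying within `radius` angstroms of `center`.

    Assumption: each atom is a dict with an "xyz" coordinate tuple.
    The radius-based definition of a pocket is illustrative only.
    """
    return [atom for atom in protein_atoms
            if math.dist(atom["xyz"], center) <= radius]

# Toy protein with three atoms; only the first two sit near the pocket center.
protein_atoms = [
    {"name": "CA", "xyz": (0.0, 0.0, 0.0)},
    {"name": "CB", "xyz": (3.0, 4.0, 0.0)},   # 5 angstroms from the center
    {"name": "N",  "xyz": (30.0, 0.0, 0.0)},  # far from the pocket
]

pocket = crop_pocket(protein_atoms, center=(0.0, 0.0, 0.0), radius=10.0)
print([atom["name"] for atom in pocket])
```

A model that only sees the cropped atoms processes far fewer inputs than one given the whole protein, which is the efficiency argument made above.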

We introduce DTMol, a novel diffusion model designed to tackle the pocket-based docking problem. Our model integrates a pretrained molecular representation framework with a diffusion transformer architecture. The advantages of this design are twofold: first, the molecular representation models we employ are pretrained on separate datasets of small molecules and protein pockets, which are larger than the datasets of ligand-pocket interactions. Consequently, these pretrained models encode typical structural features of both elements, enhancing our model’s ability to represent them and predict binding structures. Second, the transformer architecture used in our diffusion model is more capable of effectively capturing all-atom interactions between the pocket and the ligand.

We test our method on the docking benchmarks PDBBind v2020 and PoseBuster and compare it with other methods. DTMol achieves a 77.65% top-1 success rate (RMSD < 2Å) on PoseBuster, outperforming the second-best docking software, Gnina (65.65%). We also test our method on a real-world virtual screening task using Janus kinase 2 (JAK2; PDB ID: 6BBV) as the target protein. A TR-FRET screening experiment is performed on the union of the top ligands identified via physics-based docking and the diffusion model to measure their inhibitory activity against the JAK2 protein. The results show that our model achieves the only positive rank correlation score among the traditional docking methods and machine learning-based methods compared.
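For readers unfamiliar with the metric, the top-1 success rate at RMSD < 2Å can be sketched as follows. This is a minimal illustration that assumes predicted and reference ligand atoms are listed in the same order and applies no symmetry correction; the exact evaluation protocol used for DTMol is not described in this abstract, and the function names `rmsd` and `top1_success_rate` are hypothetical.

```python
import math

def rmsd(pred, ref):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(pred) != len(ref):
        raise ValueError("poses must have the same number of atoms")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(ref))

def top1_success_rate(top1_poses, reference_poses, threshold=2.0):
    """Fraction of complexes whose top-ranked pose has RMSD below `threshold`."""
    hits = sum(rmsd(p, r) < threshold
               for p, r in zip(top1_poses, reference_poses))
    return hits / len(reference_poses)

# Toy data: one reference ligand with two atoms, used for two "complexes".
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
good = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]   # shifted by 0.5 angstroms
bad = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]    # shifted by 3.0 angstroms

rate = top1_success_rate([good, bad], [ref, ref])
print(rmsd(good, ref), rmsd(bad, ref), rate)
```

In this toy case one of the two top-ranked poses falls under the 2 Å threshold, so the success rate is one half.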

To summarize, the main contributions of this work are:

We introduce DTMol, a novel diffusion model designed to address the pocket-based docking problem. This model represents the first application of the diffusion transformer architecture in solving pocket-based molecular docking.

We propose a novel SE(3)-equivariant transformer architecture with an SE(3)-invariant and a parallel SE(3)-equivariant flow, which is straightforward to implement and train while achieving excellent performance.

We achieve new state-of-the-art results on both the PDBBind and PoseBuster pocket-based docking benchmarks, obtaining top-1 prediction rates of 45.05% and 77.65%, respectively, at RMSD < 2Å.

In a virtual screening task on Janus kinase 2, DTMol achieves the only positive rank correlation score among the compared methods and identifies two new, experimentally effective JAK2 inhibitors.
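The distinction between SE(3)-invariant and SE(3)-equivariant quantities mentioned in the contributions can be checked numerically on toy functions. The sketch below is not the DTMol architecture: `invariant_features` (pairwise distances) and `equivariant_update` (a step toward the centroid) are hypothetical stand-ins, and the rigid motion tested is a single rotation about the z-axis plus a translation rather than a general element of SE(3).

```python
import math

def rotate_z(points, angle):
    """Rotate every point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def translate(points, shift):
    """Shift every point by the vector `shift`."""
    return [tuple(p[k] + shift[k] for k in range(3)) for p in points]

def transform(points):
    """One example rigid motion: rotation followed by translation."""
    return translate(rotate_z(points, 0.7), (1.0, -2.0, 0.5))

def invariant_features(points):
    """Pairwise distances, which no rigid motion can change."""
    return [math.dist(p, q)
            for i, p in enumerate(points) for q in points[i + 1:]]

def equivariant_update(points, step=0.1):
    """Move each point toward the centroid; the output follows the input's motion."""
    n = len(points)
    centroid = tuple(sum(p[k] for p in points) / n for k in range(3))
    return [tuple(p[k] + step * (centroid[k] - p[k]) for k in range(3))
            for p in points]

def flatten(points):
    return [c for p in points for c in p]

def allclose(a, b, tol=1e-9):
    return len(a) == len(b) and all(abs(u - v) <= tol for u, v in zip(a, b))

pts = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (-1.0, 0.5, 3.0)]

# Invariance: f(T(x)) == f(x)
inv_ok = allclose(invariant_features(transform(pts)), invariant_features(pts))
# Equivariance: g(T(x)) == T(g(x))
equi_ok = allclose(flatten(equivariant_update(transform(pts))),
                   flatten(transform(equivariant_update(pts))))
print(inv_ok, equi_ok)
```

A docking model benefits from both properties: invariant features describe the complex regardless of its orientation, while equivariant outputs ensure a predicted pose moves consistently when the input is rotated or translated.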

Version published to 10.1101/2025.04.13.648103 on bioRxiv
Apr 18, 2025
