Blind Virtual Screening at Scale: A Scalable End-to-End Pipeline for Blind Docking and Affinity Prediction

Abstract

Accurate and scalable prediction of protein–ligand interactions remains a central challenge in computational drug discovery, especially when the binding site is unknown (i.e., blind docking). We present a high-throughput, end-to-end pipeline for virtual screening that combines DiffDock, a diffusion-based generative model for blind docking, with UniDock Vina, an algorithm for rapid scoring. We benchmarked this approach on the CASF-2016 and DUD-E datasets, analyzing pose quality, scoring accuracy, and screening performance. We find that competitive screening power can be achieved by generating and scoring as few as three poses per protein–ligand pair, without pose refinement, which facilitates scalability. Notably, in this three-pose setting our method achieves 86.78% and 82.00% for the percent of actives among the top 1% and 10% of ranked ligands, respectively. The workflow supports blind docking and affinity prediction at a mean cost of 0.76 seconds per protein–ligand pair when generating 40 ligand poses in batched mode, parallelized across 8 NVIDIA A100 80 GB GPUs. These results demonstrate that accurate, large-scale blind virtual screening is feasible and offers a practical solution for screening against novel or less characterized protein targets. Code is available at: https://github.com/xinyu-dev/blind-screening-benchmark
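
To make the screening metric concrete, the following is a minimal sketch of how the percent of actives among top-ranked ligands might be computed once each generated pose has a score. It is not taken from the authors' repository. It assumes that each ligand is represented by its best (lowest) pose score, that lower scores indicate stronger predicted binding as in Vina-style scoring, and that the metric is the share of the top-ranked subset made up of known actives. The function names, ligand identifiers, and scores are hypothetical.

```python
import math


def best_score_per_ligand(pose_scores):
    """Collapse per-pose scores to one score per ligand.

    Assumption: the lowest (most negative) score is the best pose,
    as in Vina-style scoring.
    """
    return {lig: min(scores) for lig, scores in pose_scores.items() if scores}


def percent_actives_in_top(ligand_scores, actives, top_fraction):
    """Percent of the top-ranked ligands that are known actives.

    Assumption: this is one plausible reading of the metric named in the
    abstract; the paper's exact definition may differ.
    """
    ranked = sorted(ligand_scores, key=lambda lig: ligand_scores[lig])
    n_top = max(1, math.ceil(top_fraction * len(ranked)))
    top = ranked[:n_top]
    return 100.0 * sum(lig in actives for lig in top) / n_top


# Toy data (hypothetical): three scored poses per ligand.
pose_scores = {
    "lig_a": [-9.1, -7.4, -8.0],
    "lig_b": [-5.2, -6.0, -4.9],
    "lig_c": [-8.5, -8.7, -6.1],
    "lig_d": [-4.0, -3.8, -5.5],
}
actives = {"lig_a", "lig_d"}

best = best_score_per_ligand(pose_scores)
top_half = percent_actives_in_top(best, actives, 0.5)
top_quarter = percent_actives_in_top(best, actives, 0.25)
```

With this toy data the ranking is lig_a, lig_c, lig_b, lig_d, so one of the two ligands in the top half is an active. Taking the minimum over poses is only one possible aggregation; averaging over poses or using the top-ranked DiffDock pose are alternatives that this sketch does not cover.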
