Computational prioritization of multi-target inhibitors: explainable QSAR and docking-based discovery of dual AChE/BACE1 chemotypes
Abstract
The discovery of dual acetylcholinesterase (AChE) and β-secretase (BACE1) inhibitors remains a promising strategy against multifactorial Alzheimer’s disease. Here, rigorously curated ChEMBL-derived data were used to develop explainable quantitative structure–activity relationship (QSAR) models for dual-inhibition prioritization. Molecules were standardized, near-duplicates were removed using a Tanimoto similarity threshold (≥ 0.80), and physicochemical outliers were filtered prior to modeling. Multiple classifiers (including Light Gradient-Boosting Machine, eXtreme Gradient Boosting, Random Forest, Support Vector Machine, k-Nearest Neighbors, and Gradient Boosting Decision Trees (GBDT)) and fingerprints (e.g., RDKit fingerprints, Extended Connectivity Fingerprints (ECFP)) were benchmarked under scaffold-based nested cross-validation to prevent data leakage. Class imbalance was handled with SMOTETomek applied strictly within training folds. Model selection relied on F1 score, area under the precision–recall curve (PR-AUC), Matthews correlation coefficient (MCC), and recall; performance estimates were accompanied by bootstrap confidence intervals, calibration curves, and Y-randomization controls. In classification, the top model (GBDT + ECFP6) achieved strong generalization (recall ≈ 1.00, PR-AUC ≈ 0.84, MCC ≈ 0.81, F1 score ≈ 0.84). Shapley Additive Explanations (SHAP) analysis highlighted aromatic and hydrogen-bonding substructures as key positive contributors. Prospective candidates (e.g., CHEMBL5082250, CHEMBL1651126, CHEMBL1651127) were evaluated by active-site-focused docking against AChE (PDB: 4EY7) and BACE1 (PDB: 2G94) with essential waters retained; docking scores (ΔG, kcal·mol⁻¹) were used for relative ranking of the ligands. SwissADME/pkCSM profiling suggested CNS-relevant properties (e.g., MPO, logBB, P-gp liability) and acceptable oral drug-likeness.
Collectively, the workflow provides a reproducible and transparent pipeline for prioritizing dual AChE/BACE1 chemotypes and nominates testable scaffolds for experimental validation.
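
As an illustration of why scaffold-based splitting prevents leakage, the following is a minimal sketch, not the authors' implementation. It assumes each molecule already carries a precomputed scaffold label (for example a Bemis-Murcko scaffold string; the abstract does not state which scaffold definition was used). The function names `scaffold_group_folds` and `train_test_indices` and the greedy size-balancing rule are hypothetical choices made only for this example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def scaffold_group_folds(scaffolds: Sequence[str], n_folds: int = 3) -> List[List[int]]:
    """Assign molecule indices to folds so that no scaffold spans two folds.

    `scaffolds[i]` is the (assumed, precomputed) scaffold label of molecule i.
    """
    # Collect the indices of all molecules sharing each scaffold.
    groups: Dict[str, List[int]] = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Greedy balancing (an illustrative choice): place the largest scaffold
    # groups first, each into the fold that currently holds the fewest molecules.
    folds: List[List[int]] = [[] for _ in range(n_folds)]
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for _, members in ordered:
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(members)
    return folds


def train_test_indices(folds: List[List[int]], test_fold: int) -> Tuple[List[int], List[int]]:
    """Return (train, test) indices with one fold held out as the test set."""
    test = sorted(folds[test_fold])
    train = sorted(i for k, fold in enumerate(folds) if k != test_fold for i in fold)
    return train, test


# Toy data: seven molecules sharing four hypothetical scaffold labels.
toy_scaffolds = ["s1", "s1", "s2", "s3", "s3", "s3", "s4"]
toy_folds = scaffold_group_folds(toy_scaffolds, n_folds=3)
for k in range(len(toy_folds)):
    train, test = train_test_indices(toy_folds, k)
    print(f"fold {k}: train={train} test={test}")
```

Because whole scaffold groups move together, any resampling step (such as the SMOTETomek balancing mentioned above) would be fitted on the `train` indices only, so held-out chemotypes never influence the synthetic training samples.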