Localised Graph Neural Networks for Aqueous Solubility Prediction: A New Paradigm in QSPR Modelling


Abstract

Predicting aqueous solubility remains a key challenge in drug discovery because solubility influences a compound's absorption, distribution, metabolism, and elimination (ADME) properties. Recent advances in machine learning, particularly graph neural networks (GNNs), have set new benchmarks in quantitative structure-property relationship (QSPR) modelling. Existing methods, however, focus almost exclusively on global models that attempt to generalise across large chemical spaces. In this paper, we introduce a different paradigm: localised GNN models, each trained on structurally similar molecules. We demonstrate that this approach outperforms state-of-the-art benchmarks on the AqSolDB dataset, achieving a root mean squared error (RMSE) of 0.903 compared to 1.459 for SolTranNet. We further provide the first large-scale quantitative study of the relationship between Tanimoto similarity and solubility difference, supporting the intuition that localised models can capture the fine-grained structure-property dependencies that matter in iterative drug design. Our results establish localisation as a nontrivial and promising complement to global QSPR approaches.
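To make the two quantities named in the abstract concrete, the following is a minimal sketch of Tanimoto similarity between fingerprints, similarity-based selection of a local training set, and RMSE. It is an illustration only, not the authors' pipeline: the fingerprints are represented as plain sets of on-bit indices, and the function names, the 0.5 threshold, and the toy solubility values are all assumptions made for the example.

```python
from math import sqrt


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def local_training_set(query_fp, library, threshold=0.5):
    """Keep (fingerprint, solubility) pairs at least `threshold` similar to the query.

    The threshold is a hypothetical value; the abstract does not state one.
    """
    return [(fp, sol) for fp, sol in library if tanimoto(query_fp, fp) >= threshold]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Toy library: invented fingerprints and invented solubility values.
library = [
    ({1, 2, 3, 4}, -2.0),
    ({1, 2, 3, 5}, -2.4),
    ({7, 8, 9}, -5.1),
]
query_fp = {1, 2, 3, 6}

# Only the first two entries share enough bits with the query to be kept.
neighbours = local_training_set(query_fp, library, threshold=0.5)
```

A localised model would then be fitted to `neighbours` alone rather than to the whole library, and its predictions scored with `rmse` in the same way as a global model's.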
