Duality of Simplicity and Accuracy in QSPR: A Machine Learning Framework for Predicting Solubility of Selected Pharmaceutical Acids in Deep Eutectic Solvents


Abstract

We present a systematic machine learning study of the solubility of diverse pharmaceutical acids in deep eutectic solvents (DESs). Using an automated Dual-Objective Optimization with Iterative feature pruning (DOO-IT) framework, we analyze a solubility dataset compiled from the literature for ten pharmaceutically important carboxylic acids and augment it with new measurements for mefenamic and niflumic acids in choline chloride- and menthol-based DESs, yielding N = 1020 data points. A data-driven multi-criterion measure is then applied to select the final model from all collected accurate and parsimonious candidates. This three-step procedure enables extensive exploration of the model hyperspace and effective selection of models that combine high accuracy, simplicity, and persistence of the descriptors selected during model development. The dual-solution landscape clarifies the trade-off between complexity and cost in QSPR for DES systems and shows that physically meaningful energetic descriptors can replace or enhance explicit COSMO-RS predictions, depending on the application.
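The trade-off between accuracy and parsimony described in the abstract can be illustrated with a generic Pareto-front filter. This is a minimal sketch, not the DOO-IT implementation: the `Candidate` type, the `pareto_front` helper, and all numbers below are hypothetical and chosen only for illustration. It assumes each candidate model is summarized by its descriptor count and a validation error, both to be minimized.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical summary of one fitted model (illustrative only)."""
    name: str
    n_descriptors: int  # model complexity, lower is simpler
    error: float        # validation error, lower is more accurate


def pareto_front(candidates):
    """Return the candidates not dominated in (n_descriptors, error).

    A candidate is dominated if another one is at least as good on both
    objectives and strictly better on at least one of them.
    """
    front = []
    for c in candidates:
        dominated = any(
            o.n_descriptors <= c.n_descriptors
            and o.error <= c.error
            and (o.n_descriptors < c.n_descriptors or o.error < c.error)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    # Order from simplest to most complex for easier inspection.
    return sorted(front, key=lambda c: c.n_descriptors)


# Made-up values, not results from the study.
candidates = [
    Candidate("A", 3, 0.30),
    Candidate("B", 5, 0.22),
    Candidate("C", 5, 0.28),
    Candidate("D", 9, 0.21),
    Candidate("E", 12, 0.25),
]

front = pareto_front(candidates)
for c in front:
    print(c.name, c.n_descriptors, c.error)
```

In this toy example the non-dominated set runs from the simplest model to the most accurate one. A final choice among those models would still need an additional criterion, which is the role the abstract assigns to the multi-criterion measure.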
