Prediction of the Imprinting Quality of Molecularly Imprinted Polymers via a Data-driven Similarity-based Clustering Approach
Abstract
The imprinting factor (IF) is a key metric for evaluating the imprinting performance of molecularly imprinted polymers (MIPs). However, most existing data-driven models that predict or classify IF are hampered by inconsistent datasets, target leakage, and unreliable extrapolation. In response, we develop a similarity-based IF range prediction model that combines 2D molecular descriptors and fingerprints for templates, functional monomers, crosslinkers, and porogens with experimental conditions and structural readouts from 201 literature-derived MIP systems. The model first applies stability-enhanced feature selection and ranking. It then uses a weighted Gower similarity metric with thresholded neighbor voting and an explicit abstention policy to avoid extrapolation, achieving an accuracy of 84.4% on the validation set. Overall, this innovative MIP informatics tool provides reliable IF predictions across diverse MIP systems, identifies reliable design regions, and highlights areas of high uncertainty, supporting the rational design and performance comparison of MIPs.
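The abstract names the core prediction step but does not give its details, so the following is only a minimal sketch of how a weighted Gower similarity with thresholded neighbor voting and abstention could look. Everything specific here is an assumption made for illustration: the function names, the toy feature set (one numeric and one categorical feature), the feature weights, the similarity threshold of 0.8, the minimum of two neighbors, and the "high"/"low" IF range labels. None of these values come from the paper.

```python
from collections import Counter


def gower_similarity(a, b, kinds, ranges, weights):
    """Weighted Gower similarity between two mixed-type records.

    kinds[i] is "num" or "cat"; ranges[i] is the range of numeric
    feature i in the training data (unused for categorical features).
    """
    num = 0.0
    den = 0.0
    for x, y, kind, rng, w in zip(a, b, kinds, ranges, weights):
        if x is None or y is None:
            continue  # skip features missing in either record
        if kind == "num":
            # range-normalized absolute difference, turned into a similarity
            s = 1.0 - abs(x - y) / rng if rng > 0 else 1.0
        else:
            # categorical features match exactly or not at all
            s = 1.0 if x == y else 0.0
        num += w * s
        den += w
    return num / den if den > 0 else 0.0


def predict_with_abstention(query, train_X, train_y, kinds, ranges, weights,
                            sim_threshold=0.8, min_neighbors=2):
    """Similarity-weighted vote among neighbors above the threshold.

    Returns None (abstains) when too few training records are similar
    enough, i.e. when a prediction would amount to extrapolation.
    """
    votes = Counter()
    n_neighbors = 0
    for x, label in zip(train_X, train_y):
        s = gower_similarity(query, x, kinds, ranges, weights)
        if s >= sim_threshold:
            votes[label] += s
            n_neighbors += 1
    if n_neighbors < min_neighbors:
        return None
    return votes.most_common(1)[0][0]


# Toy data, purely hypothetical: (numeric condition, categorical monomer id)
kinds = ["num", "cat"]
ranges = [10.0, 0.0]   # assumed training range of the numeric feature
weights = [2.0, 1.0]   # assumed feature weights, e.g. from a feature ranking
train_X = [(1.0, "m1"), (2.0, "m1"), (9.0, "m2")]
train_y = ["high", "high", "low"]

inside = predict_with_abstention((1.5, "m1"), train_X, train_y,
                                 kinds, ranges, weights)
outside = predict_with_abstention((5.5, "m3"), train_X, train_y,
                                  kinds, ranges, weights)
print(inside, outside)
```

In this sketch the first query has two close neighbors and receives a label, while the second has no neighbor above the threshold and the function abstains. Returning `None` rather than a forced label is what lets such a model separate well-covered design regions from regions of high uncertainty.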