High-throughput antioxidant screening of flavan-3-ols via active learning
Abstract
Flavan-3-ols, a subclass of dietary flavonoids bearing multiple hydroxyl (-OH) groups, are well recognized for their antioxidant potency, yet systematically exploring derivatives that vary widely in the number and position of -OH substituents remains a computational challenge. Here, we introduce an integrated computational strategy that combines density functional theory (DFT) with graph neural network (GNN)-based machine learning (ML) to predict three key thermodynamic descriptors governing antioxidant activity: ionization potential (IP), O-H bond dissociation enthalpy (BDE), and proton affinity (PA). Starting from a virtual library of more than 5600 flavan-3-ol variants, we used DFT to evaluate a subset of 1235 compounds and then trained ML models to guide further DFT calculations toward the most promising candidates. The final ML models achieved high predictive accuracy, particularly for the molecular-wide properties, which represent the most reactive site (mean absolute errors, MAEs: 1.02 kcal/mol for IP, 0.68 kcal/mol for BDEmol, and 0.79 kcal/mol for PAmol). Our thermodynamic analysis confirms that aromatic hydroxylation enhances antioxidant potential and that the consistent ordering PAmol < BDEmol < IP supports the sequential proton loss electron transfer (SPLET) pathway as the predominant mechanism in aqueous media. Structure-antioxidant activity relationship (SAR) mapping further identifies the C6 and C8 positions on the flavan-3-ol core and the C2’/C6’ positions on the pendant phenyl ring as dominant reactive sites. This work establishes a scalable computational framework that not only accurately predicts antioxidant properties but also provides mechanistic and structural insights for accelerating the discovery of novel flavonoid-based antioxidants.
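The relationship between site-level and molecular-wide descriptors, and the way their ordering points to a mechanism, can be illustrated with a minimal Python sketch. It assumes that the molecular-wide BDE and PA are the minima over all O-H sites (the most reactive site has the lowest value) and that the pathway whose first step has the lowest enthalpy is thermodynamically preferred: BDE for hydrogen atom transfer (HAT), IP for single electron transfer followed by proton transfer (SET-PT), and PA for SPLET. The class and function names and all numerical values are hypothetical and chosen only for illustration; they are not taken from the study's code or data.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SiteDescriptors:
    """Per-site O-H descriptors in kcal/mol (hypothetical container)."""
    bde: float  # O-H bond dissociation enthalpy
    pa: float   # proton affinity of the corresponding phenoxide anion


def molecule_wide(sites: Dict[str, SiteDescriptors]) -> Dict[str, float]:
    """Reduce site-level values to molecular-wide ones.

    Assumption: the most reactive site is the one with the lowest value,
    so each molecular-wide descriptor is a minimum over all O-H sites.
    """
    if not sites:
        raise ValueError("at least one O-H site is required")
    return {
        "BDEmol": min(s.bde for s in sites.values()),
        "PAmol": min(s.pa for s in sites.values()),
    }


def preferred_first_step(ip: float, bde_mol: float, pa_mol: float) -> str:
    """Return the pathway whose first step is thermodynamically cheapest."""
    first_steps = {"SET-PT": ip, "HAT": bde_mol, "SPLET": pa_mol}
    return min(first_steps, key=first_steps.get)


# Illustrative, made-up numbers (kcal/mol); not results from the paper.
sites = {
    "3'-OH": SiteDescriptors(bde=78.5, pa=37.2),
    "4'-OH": SiteDescriptors(bde=80.0, pa=35.0),
}
mol = molecule_wide(sites)
mechanism = preferred_first_step(ip=105.0, bde_mol=mol["BDEmol"], pa_mol=mol["PAmol"])
print(mol, mechanism)
```

With values ordered as PAmol < BDEmol < IP, as reported above for aqueous media, this comparison selects SPLET. The minimum-over-sites reduction also shows that BDEmol and PAmol may come from different hydroxyl groups within the same molecule.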
Scientific Contribution
This work introduces the first large-scale predictive framework for the complex flavan-3-ol chemical space, a class of antioxidants for which comprehensive, multi-property models were previously unavailable. Our approach enables the development of high-fidelity GNN models for a holistic set of molecular-wide antioxidant descriptors (IP, BDEmol, PAmol), achieving excellent predictive accuracy (MAEs < 1.1 kcal/mol for all properties). The resulting models were used to identify novel, synthetically feasible antioxidant candidates and have been deployed as a freely available web application for community use at https://antioxidant-predict.streamlit.app