High-throughput antioxidant screening of flavan-3-ols via active learning

Abstract

Flavan-3-ols, a subclass of dietary flavonoids bearing multiple hydroxyl (-OH) groups, are well recognized for their antioxidant potency, yet systematically exploring their derivatives, which vary widely in the number and position of -OH substituents, remains a computational challenge. Here, we introduce an integrated computational strategy that combines density functional theory (DFT) with graph neural network (GNN)-based machine learning (ML) to predict three key thermodynamic descriptors governing antioxidant activity: ionization potential (IP), O-H bond dissociation enthalpy (BDE), and proton affinity (PA). Starting from a virtual library of more than 5600 flavan-3-ol variants, we used DFT to evaluate a subset of 1235 compounds and then trained ML models to guide further DFT calculations toward the most promising candidates. The final ML models achieved high predictive accuracy, particularly for the molecular-wide properties that represent the most reactive site (MAEs: 1.02 kcal/mol for IP, 0.68 kcal/mol for BDEmol, and 0.79 kcal/mol for PAmol). Our thermodynamic analysis confirms that aromatic hydroxylation enhances antioxidant potential and that the consistent ordering PAmol < BDEmol < IP supports sequential proton loss electron transfer (SPLET) as the predominant mechanism in aqueous media. Structure-antioxidant activity relationship (SAR) mapping further identifies the C6 and C8 positions on the flavan-3-ol core and the C2’/C6’ positions on the pendant phenyl ring as the dominant reactive sites. This work establishes a scalable computational framework that not only predicts antioxidant properties accurately but also provides mechanistic and structural insights for accelerating the discovery of novel flavonoid-based antioxidants.
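
The descriptor logic summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that a molecular-wide BDE or PA is the minimum over all O-H sites of a molecule (the "most reactive site"), and that the smallest of the three descriptors indicates the thermodynamically preferred first step: PA for SPLET, BDE for hydrogen atom transfer (HAT), and IP for single electron transfer followed by proton transfer (SET-PT). The names `SiteDescriptors`, `molecule_wide`, and `favoured_first_step` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteDescriptors:
    """Per-site descriptors for one O-H group (values in kcal/mol)."""
    site: str    # label of the hydroxyl position, e.g. "4'-OH"
    bde: float   # O-H bond dissociation enthalpy at this site
    pa: float    # proton affinity at this site


def molecule_wide(sites):
    """Return the sites that define BDEmol and PAmol.

    Assumption: the molecular-wide value is the minimum over all
    O-H sites, i.e. the most reactive site for each descriptor.
    """
    if not sites:
        raise ValueError("at least one O-H site is required")
    bde_site = min(sites, key=lambda s: s.bde)
    pa_site = min(sites, key=lambda s: s.pa)
    return bde_site, pa_site


def favoured_first_step(ip, bde_mol, pa_mol):
    """Name the mechanism whose first step has the lowest descriptor.

    Assumption: a purely thermodynamic comparison; kinetics and
    explicit solvent effects are ignored.
    """
    steps = {
        "SPLET (proton loss)": pa_mol,
        "HAT (hydrogen atom transfer)": bde_mol,
        "SET-PT (electron transfer)": ip,
    }
    return min(steps, key=steps.get)
```

Under these assumptions, a molecule for which PAmol < BDEmol < IP returns "SPLET (proton loss)", which mirrors the ordering argument made in the abstract.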

Scientific Contribution

This work introduces the first large-scale predictive framework for the complex flavan-3-ol chemical space, a class of antioxidants for which comprehensive, multi-property models were previously unavailable. Our approach enables the development of high-fidelity GNN models for a holistic set of molecular-wide antioxidant descriptors (IP, BDEmol, PAmol), with MAEs below 1.1 kcal/mol for all three properties. The resulting models were used to discover novel, synthetically feasible antioxidant candidates and have been deployed as a freely available web application for community use at: https://antioxidant-predict.streamlit.app
