Bayesian optimization of solution-processed thermoelectric polymers using a database of ab initio electronic structure data


Abstract

Historically, organic materials have been designed by choosing the components of a system based on expert knowledge of organic synthetic chemistry. Experimental synthesis and processing, often followed by computational validation, is the standard route to designing materials such as polymers. The advent of data-driven machine learning approaches can now shift that paradigm to one in which computational information leads experimental validation, selecting a much smaller, more promising, and perhaps unexpected set of new materials. This work describes one such investigation, which characterizes novel, hypothetical, and promising polymer candidates by their calculated molecular-scale electronic and chemical parameters and then optimizes them for a key chemical property using a chemically informed Bayesian surrogate model of those properties. As a test case, over 7300 combinations of novel, hypothetical semiconducting diketopyrrolopyrrole-based (DPP) polymers and commonly used solvents, with properties computed by density functional theory, were screened based on their free energies of solvation, $\Delta$G$_{solv}$. From this synthetic data set, we trained a physics-informed Gaussian process model that linked molecular-scale electronic structure properties to $\Delta$G$_{solv}$, and then used Bayesian optimization via the PAL 2.0 software to identify key descriptors for predicting optimal polymer solvation. Implicit and explicit solvation models of these synthetic polymers, combined with Bayesian optimization, exhibited a minimum in $\Delta$G$_{solv}$ as a function of the solvent dielectric constant. As a result, we predict an "optimal" solvent dielectric constant (around 10) for the entire DPP-based polymer class. The chemically informed nature of our approach allowed us to identify a molecular electronic property that demonstrates near-linear additivity when an aggregate synthetic polymer is constructed from functional group sub-units. This property was the isotropic quadrupole moment, which we found to be highly correlated with $\Delta$G$_{solv}$. As validation of this chemical insight, we showed that the minimum in $\Delta$G$_{solv}$ was associated with the highest experimentally measured conductivity among a set of synthesized doped polymers. These observations matter because PAL 2.0 has provided chemical insight that now allows us to quickly screen solvents for potential DPP-based polymers based on the quadrupole moment of the polymer repeat unit, and to identify preferred compatible solvents corresponding to a minimized $\Delta$G$_{solv}$. This study also highlights the effectiveness of the PAL 2.0 algorithm, which found the optimal solution after sampling just 6% of the parameter space and in a very short execution time (on the order of minutes), a feat that cannot be duplicated experimentally.
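The workflow summarized above, a Gaussian process surrogate that steers which candidates are evaluated next, can be illustrated with a minimal sketch. This is not PAL 2.0 and uses none of the study's data: the objective `toy_objective` is a hypothetical stand-in whose minimum is placed at a dielectric constant of 10 purely for illustration, and the kernel length scale, noise level, candidate grid, and function names are all assumptions. The sketch evaluates 12 of 200 candidate dielectric constants (6% of this toy space), choosing each new point by expected improvement.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel with unit prior variance (assumed settings)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    # Prior variance is 1.0; clip to avoid tiny negative values from round-off.
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, np.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement acquisition for a minimization problem."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def toy_objective(log_eps):
    """Hypothetical stand-in for a solvation free energy; minimum at eps = 10."""
    return (log_eps - 1.0) ** 2

def run_toy_search(n_candidates=200, n_iterations=10):
    # Candidate dielectric constants from 2 to 80, searched on a log10 scale.
    eps = np.logspace(np.log10(2.0), np.log10(80.0), n_candidates)
    x = np.log10(eps)
    evaluated = [0, n_candidates - 1]          # start from the two end points
    y = [float(toy_objective(x[i])) for i in evaluated]
    for _ in range(n_iterations):
        y_arr = np.array(y)
        offset = y_arr.mean()                  # center targets for the zero-mean GP
        mean, std = gp_posterior(x[evaluated], y_arr - offset, x)
        ei = expected_improvement(mean, std, y_arr.min() - offset)
        ei[evaluated] = -np.inf                # never re-evaluate a candidate
        nxt = int(np.argmax(ei))
        evaluated.append(nxt)
        y.append(float(toy_objective(x[nxt])))
    best = int(np.argmin(y))
    return eps[evaluated[best]], y[best], evaluated

best_eps, best_y, evaluated = run_toy_search()
print(f"evaluated {len(evaluated)} of 200 candidates; best eps = {best_eps:.1f}")
```

In a real screen, the one-dimensional grid would be replaced by descriptor vectors for each polymer-solvent pair and the toy objective by a computed $\Delta$G$_{solv}$; the loop structure would stay the same.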
