Occupancies vs Embeddings: Internal representations of Ab Initio and Graph Probabilistic Models
Abstract
Chemical modeling has traditionally been divided into two largely non-overlapping frameworks: the exact framework of \textit{ab initio} models and the data-driven framework of statistical models. Recent advances in machine learning, however, have encroached on the exact \textit{ab initio} framework by using an \textit{end-to-end} probabilistic model of the Schr\"odinger equation in the Born-Oppenheimer approximation, mapping raw nuclear configurations directly to quantum observables such as the energy. While probabilistic modeling promises to speed up computational predictions, the internal representation of these models, and its connection to the underlying quantum data (the wave function or electron density) on which they implicitly rely for training, remain relatively poorly understood. This study makes a first comparison of the internal representations of probabilistic end-to-end models with those of pure \textit{ab initio} frameworks such as density functional theory. We do this by comparing the internal atomwise representations of graph models, also called embeddings, to their most natural \textit{ab initio} counterpart, electron occupancy values. Our findings show that the embeddings can be used to transfer-learn atomic occupancy values, whereas the reverse mapping appears to be less accurate. We discuss the assumption that the probabilistic model could infer the existence of an underlying electron density underpinning the computational \textit{ab initio} data.