Bayesian Inference of Expensive Simulators: Survey and Comparison of Design Choices
Abstract
We present a survey of approaches to Bayesian inference with simulator forward models, focusing on settings with prohibitively expensive simulators. In this domain, Bayesian optimization-based methods have emerged as the most promising approach. We unify existing methods into a general framework, enabling consistent reasoning about shared algorithmic structure across the fragmented literature. Within this framework, we identify the primary design choices and perform a systematic empirical comparison of their effects on inference efficiency in terms of required simulation budget. A key result of our study is that inference efficiency is often dominated by the choice of surrogate-model response variable, as it directly governs the complexity of the modeled function. Specifically, we demonstrate that the prevalent scalar log-likelihood is a poor response variable, as it discards crucial information. Based on this insight, we introduce the concept of custom proxy variables: transformations of simulator outputs engineered to retain more information and provide smoother response surfaces, substantially improving inference efficiency over standard response variables. We provide actionable guidance and a numerically robust Julia package to facilitate practical adoption of Bayesian optimization for simulator inverse problems.
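
The claim that the response variable governs the complexity of the modeled function can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper or its Julia package: the two-output linear `simulator`, the Gaussian noise level `sigma`, the three design points, and the use of a least-squares linear fit as a stand-in for whatever surrogate model is used in practice. The sketch compares fitting the surrogate to the simulator outputs (and computing the log-likelihood afterwards) against fitting it to the scalar log-likelihood directly.

```python
import numpy as np

def simulator(x):
    # Hypothetical toy simulator: two outputs, each linear in the parameter x.
    return np.array([2.0 * x + 1.0, -x + 3.0])

y_obs = simulator(0.7)  # synthetic observation at an assumed true parameter
sigma = 0.5             # assumed Gaussian noise standard deviation

def log_likelihood(x):
    # Gaussian log-likelihood, up to an additive constant.
    r = simulator(x) - y_obs
    return -0.5 * np.sum(r**2) / sigma**2

# A small simulation budget: three design points.
x_train = np.array([-2.0, 0.0, 2.0])
outputs = np.array([simulator(x) for x in x_train])      # shape (3, 2)
loglik = np.array([log_likelihood(x) for x in x_train])  # shape (3,)

# Linear least-squares surrogate (stand-in model) for each response variable.
A = np.column_stack([x_train, np.ones_like(x_train)])
coef_out, *_ = np.linalg.lstsq(A, outputs, rcond=None)
coef_ll, *_ = np.linalg.lstsq(A, loglik, rcond=None)

x_test = np.linspace(-2.0, 2.0, 41)
A_test = np.column_stack([x_test, np.ones_like(x_test)])

# Route 1: model the outputs, then evaluate the log-likelihood on predictions.
pred_out = A_test @ coef_out
ll_via_outputs = -0.5 * np.sum((pred_out - y_obs) ** 2, axis=1) / sigma**2

# Route 2: model the scalar log-likelihood directly.
ll_direct = A_test @ coef_ll

ll_true = np.array([log_likelihood(x) for x in x_test])
err_via_outputs = np.max(np.abs(ll_via_outputs - ll_true))
err_direct = np.max(np.abs(ll_direct - ll_true))
print(err_via_outputs, err_direct)
```

In this toy setting the outputs are linear in the parameter, so the output surrogate is exact up to floating-point error, while the log-likelihood is quadratic and the same surrogate class cannot represent it. The example only shows the direction of the effect under these assumptions; it says nothing about its magnitude for real simulators.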