Overinflation and overconcentration: why Cauchy perturbation kernels are the right choice for ABC-SMC

Marc Sturrock
Vahid Shahrezaei

Abstract

Approximate Bayesian computation sequential Monte Carlo (ABC-SMC) propagates its particles with a perturbation kernel, and with the standard Normal kernel it degrades sharply as the parameter dimension grows, a failure usually attributed to dimension itself. We show instead that it is governed by the quality of the summary statistics, with dimension entering only through a separate and milder mechanism, and that the two must act together for the Normal kernel to break. The first ingredient is covariance overinflation: the kernel covariance, estimated from the particle cloud, overshoots the true posterior covariance by a factor set by information loss in the summary statistics. We derive this overscaling factor in closed form for a Gaussian model with sufficient statistics and show that it stays modest at any dimension, shrinking toward its baseline value as the tolerance tightens; the extreme values seen in practice (of order 10³) are a signature of insufficient summaries, not of dimension. The second ingredient is perturbation overconcentration: the normalised Normal step size concentrates around one as the dimension grows, so every proposal overshoots by the same factor. Either ingredient alone is harmless; only their combination breaks the Normal kernel. A Cauchy kernel (multivariate t with one degree of freedom) removes the concentration, keeping a positive acceptance rate under arbitrary overscaling at a bounded worst-case cost of 1.87× in expected squared jump distance. In a Metropolis–Hastings framework we derive closed-form acceptance rates for both kernels that illustrate the advantage of the Cauchy kernel in this limit. A series of full ABC-SMC computational experiments on five problems at d = 12, including a hierarchical gene-expression model, show the Cauchy reducing the sliced Wasserstein distance to the reference posterior by factors of up to 50 with the same simulation budget.
Since the summary statistics are commonly insufficient for the models that require ABC, overinflation is structural and the Cauchy perturbation kernel is the right default for problems in higher dimensions.
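
The overconcentration effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes an identity kernel covariance, takes the "normalised step size" to mean the Euclidean norm of the step divided by sqrt(d), and draws the multivariate Cauchy as a standard Normal vector divided by the absolute value of an independent standard Normal scalar (the usual construction of a multivariate t with one degree of freedom). The function name `normalised_step_sizes` and the chosen dimensions are hypothetical.

```python
import numpy as np


def normalised_step_sizes(d, n, kernel, rng):
    """Draw n perturbation steps in dimension d and return ||step|| / sqrt(d).

    Assumes an identity scale matrix; `kernel` is "normal" or "cauchy".
    """
    z = rng.standard_normal((n, d))
    if kernel == "cauchy":
        # Multivariate t with 1 degree of freedom: divide each Normal draw
        # by sqrt(chi-square_1), i.e. by |u| with u ~ N(0, 1).
        u = rng.standard_normal((n, 1))
        z = z / np.abs(u)
    elif kernel != "normal":
        raise ValueError(f"unknown kernel: {kernel!r}")
    return np.linalg.norm(z, axis=1) / np.sqrt(d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for d in (2, 12, 100):
        for kernel in ("normal", "cauchy"):
            r = normalised_step_sizes(d, 20_000, kernel, rng)
            q05, q50, q95 = np.quantile(r, [0.05, 0.5, 0.95])
            print(f"{kernel:>6}  d={d:<4d} 5%={q05:.2f}  50%={q50:.2f}  95%={q95:.2f}")
```

Under these assumptions the Normal quantiles close in on one as d grows, so an overinflated covariance pushes essentially every proposal too far by the same factor. The Cauchy quantiles stay spread over more than an order of magnitude at every d, so some proposals remain short enough to be accepted.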

Version published to 10.64898/2026.06.24.734205 on bioRxiv
Jun 25, 2026

