Choosing informative priors in Bayesian regression models. A simulation study and tutorial using Stan and R


Abstract

Background
Bayesian regression models provide a robust framework for complex data analysis and are particularly advantageous in the small-sample scenarios common in medical research. However, specifying appropriate prior distributions, which incorporate existing knowledge to regularize model parameters, remains a challenge for many researchers and can lead to unstable or implausible estimates. This study aims to demonstrate the impact of different prior distributions on regression models and to provide a practical guide for choosing and justifying informative priors that produce more stable and credible results.

Methods
The study involved two parts. First, a simulation study systematically assessed the sensitivity of Bayesian linear regression models to prior specification: sample size, prior location, and prior scale were varied to observe their impact on posterior estimates for a known true effect size. Second, a case-control study using real-world patient data (N = 526) demonstrated the practical application of choosing informative priors. Bayesian logistic regression models were used to analyse the relationship between severe dementia and fall incidence, comparing results from priors based on existing literature (“believer”), conservative priors (“agnostic”), and priors assuming an opposite effect (“sceptical”).

Results
The simulation study showed that strongly informative priors had a substantial influence on posterior estimates, particularly at smaller sample sizes. As the sample size increased, the influence of the data grew and the estimates converged toward the true effect. In the case-control study, a standard frequentist analysis produced an odds ratio of 8.87 with a very wide and unstable confidence interval (1.66–165.19). In contrast, a Bayesian model using a moderately informative “believer” prior derived from existing research yielded a more plausible odds ratio of 4.40 with a substantially narrower and more precise credible interval (1.82–12.54).

Conclusions
The careful and transparent specification of informative priors is a critical tool in Bayesian analysis, especially when data are sparse. By incorporating justified, evidence-based assumptions, researchers can regularize models to prevent implausible outcomes and produce more stable, interpretable, and credible results. This approach enhances the robustness of statistical inference in fields where small sample sizes are a frequent challenge.
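The simulation finding — that a strongly informative prior dominates at small sample sizes while the data take over as the sample grows — can be illustrated with a minimal sketch. This is not the paper's Stan/R simulation code; it uses a conjugate normal-normal model (where the posterior has a closed form) with an assumed true effect of 2.0 and an assumed prior centred away from the truth, purely to show the qualitative pattern the abstract describes.

```python
def posterior_mean_sd(prior_mean, prior_sd, data_mean, data_sd, n):
    """Closed-form posterior for a normal mean with known data SD.

    The posterior mean is a precision-weighted average of the prior mean
    and the sample mean; the data precision grows linearly with n, so the
    prior's pull fades as the sample size increases.
    """
    prior_prec = 1.0 / prior_sd**2
    data_prec = n / data_sd**2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
    return post_mean, post_prec**-0.5

# Illustrative values, not from the paper:
true_effect = 2.0                  # "known true effect size"
prior_mean, prior_sd = 0.0, 0.5    # strongly informative prior centred at zero

for n in (10, 100, 1000):
    m, s = posterior_mean_sd(prior_mean, prior_sd, true_effect, 1.0, n)
    print(f"n={n:5d}  posterior mean={m:.3f}  posterior sd={s:.3f}")
```

At n = 10 the posterior mean sits well below the true effect because the prior carries comparable precision to the data; by n = 1000 the estimate has essentially converged to the truth, mirroring the simulation result reported above.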
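The believer/agnostic/sceptical comparison can likewise be sketched with a back-of-the-envelope normal approximation on the log-odds-ratio scale. This is not the paper's Bayesian logistic regression and will not reproduce its MCMC results; the log-OR and its standard error are back-computed from the reported frequentist OR of 8.87 and CI (1.66–165.19), and the three prior means and scales below are assumptions chosen only to illustrate how each prior pulls the unstable estimate.

```python
import math

def combine(prior_mean, prior_sd, est, se):
    """Precision-weighted normal approximation to the posterior log-OR."""
    prior_prec, data_prec = prior_sd**-2, se**-2
    post_prec = prior_prec + data_prec
    mean = (prior_prec * prior_mean + data_prec * est) / post_prec
    return mean, post_prec**-0.5

# Frequentist log-OR and SE recovered from the reported OR and 95% CI
# (an approximation, not the paper's raw data):
est = math.log(8.87)
se = (math.log(165.19) - math.log(1.66)) / (2 * 1.96)

# Hypothetical prior specifications on the log-OR scale:
priors = {
    "believer":  (math.log(4), 1.0),   # literature suggests a positive effect
    "agnostic":  (0.0, 2.5),           # weakly informative, centred at no effect
    "sceptical": (-math.log(4), 1.0),  # assumes the opposite effect
}

for name, (pm, psd) in priors.items():
    m, s = combine(pm, psd, est, se)
    lo, hi = m - 1.96 * s, m + 1.96 * s
    print(f"{name:9s} OR={math.exp(m):5.2f}  "
          f"approx 95% interval ({math.exp(lo):.2f}, {math.exp(hi):.2f})")
```

Even in this crude approximation, the believer prior shrinks the implausibly large frequentist OR toward a more moderate value with a narrower interval, while the sceptical prior pulls it close to no effect — the qualitative behaviour the case-control analysis demonstrates.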