Enhancing Bayesian Kernel Machine Regression: A Dynamic Thresholding Framework to Address Variability and Skewness in High-Dimensional Environmental Health Data

Abstract

Bayesian Kernel Machine Regression (BKMR) is widely used in environmental health research to model complex, nonlinear, and interactive relationships in high-dimensional datasets. However, applying a fixed posterior inclusion probability (PIP) threshold can lead to inconsistent test size control, because the resulting test size varies with the coefficient of variation (CV) of the data and with the sample size. This study introduces a dynamic thresholding approach that adapts to these dataset characteristics, improving the sensitivity and reliability of BKMR analyses. A four-parameter logistic regression model was developed to estimate the 95th percentile of PIP as a function of the log-transformed CV and sample size. Simulations were performed across a broad range of CV values and sample sizes to compare the test size achieved by fixed and dynamic thresholds. The dynamic threshold was validated with independent simulated datasets and applied to data from the 2011–2014 National Health and Nutrition Examination Survey (NHANES). The dynamic threshold consistently maintained test sizes near the nominal 5%, outperforming the fixed threshold, whose test size varied substantially. In the NHANES application, cadmium, manganese, and lead were identified as significant contributors to cognitive performance, with cadmium emerging as the most influential. The dynamic threshold approach improves the precision of variable selection in BKMR, offering a more reliable method for analyzing complex exposure-response relationships in environmental health research.
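To make the thresholding idea concrete, here is a minimal Python sketch of a four-parameter logistic curve used as a PIP cutoff. It rests on several assumptions that do not come from the abstract: the parameterization (lower asymptote, upper asymptote, midpoint, slope), the placeholder parameter values, and the function names are all hypothetical. The abstract also does not say how sample size enters the model, so the sketch takes only log(CV) as input and treats the parameters as if they had been fitted for one sample size.

```python
import math


def four_pl(x, lower, upper, midpoint, slope):
    """Four-parameter logistic curve: moves from `lower` to `upper` around `midpoint`."""
    return lower + (upper - lower) / (1.0 + math.exp(-slope * (x - midpoint)))


def dynamic_threshold(cv, params):
    """Hypothetical PIP cutoff as a 4PL function of log(CV).

    `params` is (lower, upper, midpoint, slope), assumed to have been fitted
    to the 95th percentile of null-simulation PIPs for one sample size.
    """
    if cv <= 0:
        raise ValueError("CV must be positive")
    return four_pl(math.log(cv), *params)


def empirical_test_size(null_pips, threshold):
    """Share of PIPs from null simulations that exceed the threshold."""
    if not null_pips:
        raise ValueError("need at least one PIP value")
    return sum(p > threshold for p in null_pips) / len(null_pips)


# Placeholder parameters for illustration only; NOT the fitted values from the study.
PARAMS = (0.1, 0.9, 0.0, 1.5)
```

With a cutoff defined this way, a variable would be flagged when its PIP exceeds `dynamic_threshold(cv, PARAMS)` rather than a fixed value, and `empirical_test_size` gives the false-positive rate that the simulations aim to hold near 5%.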
