Hydrogeological Risk Assessment of Coal Floor Water Inrush Using a Hybrid Machine Learning Model with Uncertainty Quantification
Abstract
Coal mine floor water inrush is a major geological hazard that threatens the safety of underground mining operations. Under small-sample conditions, high geological uncertainty often causes traditional machine learning models to overfit and limits their ability to provide reliable risk estimates. To address these challenges, we propose a two-stage framework that integrates robust factor identification with uncertainty-aware prediction. In the first stage, four complementary methods (Spearman correlation, single-factor AUC, permutation-based random forest importance, and LOOCV-stabilized logistic regression) are combined into a consensus scoring system to select the key controlling factors. To ensure stability under small-sample constraints, the importance of each factor is averaged over leave-one-out cross-validation (LOOCV) folds, which mitigates the influence of individual outlying samples. In the second stage, a Bayesian Logistic Regression (BLR) model is trained via LOOCV and benchmarked against Random Forest, Support Vector Machine, Logistic Regression, and Gaussian Process Classification. Results show that BLR achieves the highest composite score (0.878) and exceptional stability while uniquely providing dual uncertainty quantification through 95% credible intervals and predictive entropy. This approach not only enhances prediction reliability but also enables engineers to distinguish between model confidence and prediction ambiguity, offering a transparent, interpretable, and risk-informed solution for hydrogeological hazard assessment in data-scarce mining environments.
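The dual uncertainty quantification mentioned above can be illustrated with a minimal sketch. It assumes that posterior draws of the logistic regression coefficients are already available. The abstract does not state how the posterior is obtained, so the Gaussian draws, the two-factor inputs, and the helper names (`predictive_summary`, `binary_entropy`, `percentile`) below are hypothetical and are not the authors' implementation. The width of the 95% credible interval reflects how confident the model is in its parameters, while the entropy of the mean predicted probability reflects how ambiguous the prediction itself is.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_entropy(p):
    # Shannon entropy (in bits) of a Bernoulli(p) prediction; 0 when p is 0 or 1.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def percentile(sorted_vals, q):
    # Linear-interpolation percentile for q in [0, 100] on a pre-sorted list.
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def predictive_summary(weight_samples, bias_samples, x):
    # One inrush probability per posterior draw of (weights, bias).
    probs = sorted(
        sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
        for w, b in zip(weight_samples, bias_samples)
    )
    mean_p = sum(probs) / len(probs)
    return {
        "mean": mean_p,
        "ci95": (percentile(probs, 2.5), percentile(probs, 97.5)),
        "entropy": binary_entropy(mean_p),
    }


# Hypothetical posterior draws for two standardized factors (illustration only).
rng = random.Random(0)
weight_samples = [[rng.gauss(1.2, 0.4), rng.gauss(-0.8, 0.4)] for _ in range(2000)]
bias_samples = [rng.gauss(0.0, 0.3) for _ in range(2000)]

clear_case = predictive_summary(weight_samples, bias_samples, [3.0, -3.0])
borderline = predictive_summary(weight_samples, bias_samples, [0.1, 0.1])
```

In this sketch, a sample far from the decision boundary (`clear_case`) yields low entropy, whereas a sample near the boundary (`borderline`) yields entropy close to 1 bit. Reporting the interval alongside the entropy lets the two kinds of uncertainty be read separately.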