Ensemble statistical techniques for predictive analytics in high-noise environments: A novel noise-weighted stacking framework
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
Predictive analytics in real-world deployments routinely encounter data corrupted by measurement noise, sensor drift, and environmental interference—conditions collectively termed high-noise environments (HNE). The discriminative performance of classical single-classifier approaches substantially decreases as the signal-to-noise ratio (SNR) decreases below 15 dB, highlighting the need for robust ensemble strategies. This paper introduces the entropy-weighted noise-aware ensemble (EWNE-Stack), a novel two-level stacking meta-learner in which base classifier weights are dynamically adjusted at inference time using an information-theoretic noise-level estimator. The estimator computes the Shannon entropy over rolling prediction windows and inverts it to assign higher confidence weights to classifiers that exhibit lower local prediction uncertainty. The proposed EWNE-Stack achieves a classification accuracy of 96.2% at an SNR = 5 dB and an area under the ROC curve (AUC) of 0.962, representing statistically significant improvements of 2.8%, 2.8%, 3.9%, 9.9%, and 16.4 percentage points over gradient boosting, random forest, SVM, logistic regression, and naïve Bayes, respectively. Extensive experiments on three benchmark datasets—the UCI HEPMASS particle physics dataset, the PhysioNet MIT-BIH Arrhythmia dataset, and a synthetically generated high-noise tabular dataset—confirm the reproducibility and generalizability of the proposed method. All source code, pretrained model checkpoints, and experimental protocols are publicly released to ensure full reproducibility in accordance with FAIR data principles. MSC Classes: 62H30, 68T05, 62G35