Interpretable Machine Learning Reveals Synergy-Gain Windows and Dual-Objective Mix-Proportion Boundaries for Compressive Strength and Peak Strain in Hybrid Steel–PVA Fiber-Reinforced Concrete
Abstract
Hybrid steel–PVA fiber-reinforced concrete is a promising way to improve both load-bearing capacity and deformation capacity. However, the coupled effects of fiber parameters and volume-fraction combinations on compressive strength (σc) and peak strain (εc) are not yet fully understood, and a unified, interpretable, engineering-oriented quantitative framework is lacking. This study compiled experimental data from 26 published studies into a multi-source database of 397 data records for σc and 203 data records for εc. Based on this database, an analytical framework was proposed that combines model prediction, SHAP-based interpretation, Monte Carlo marginalization, synergy-gain window determination, and dual-objective mix-proportion optimization. For σc prediction, LightGBM achieved the highest test-set R² (0.9783), whereas CatBoost showed more robust error control (MAE = 2.7409 MPa); CatBoost was therefore selected as the base model for the subsequent interpretation analysis. For εc prediction, Bayesian-optimized CatBoost achieved the best test performance (R² = 0.9659, MAE = 0.0218, RMSE = 0.0358), while the transfer-learning model reached comparable accuracy (R² = 0.9650). SHAP analysis revealed that σc is mainly governed by matrix mix-proportion factors and steel fiber volume fraction, whereas εc is more sensitive to S/B and PVA-related variables. The mean synergy-gain maps, generated by Monte Carlo marginalization and two-dimensional grid evaluation, showed clear differences between the two targets. Positive synergy in σc was highly localized, with a maximum mean synergy gain of 4.7949 MPa at (Steel, PVA) = (1.875%, 2.000%). By contrast, εc exhibited a wider positive-synergy region, with a peak value of 0.0141629 at (Steel, PVA) = (0.38%, 1.62%). The engineering output of this study is therefore not a single optimal mix point but a set of candidate windows for different performance targets, together with boundary-risk identification and priorities for experimental validation.
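The abstract does not define the mean synergy gain or describe the sampling scheme, so the following is only a minimal sketch of how a synergy-gain map could be computed by Monte Carlo marginalization on a two-dimensional grid. It assumes the gain is the second-order difference G(s, p) = E[f(s, p, z)] − E[f(s, 0, z)] − E[f(0, p, z)] + E[f(0, 0, z)], where z denotes the remaining mix variables sampled at random. The function `toy_model`, its coefficients, the background variables, the grid spacing, and the tolerance `tol` are all hypothetical placeholders, not the paper's trained CatBoost model or data.

```python
import numpy as np

def toy_model(steel, pva, background):
    """Hypothetical stand-in for a trained regressor (NOT the paper's model).

    background: array of shape (n, 2) holding two assumed mix variables.
    Returns one prediction per background sample.
    """
    wb, sb = background[:, 0], background[:, 1]
    base = 80.0 - 60.0 * wb + 5.0 * sb          # assumed matrix contribution
    return base + 6.0 * steel + 1.5 * pva + 2.0 * steel * pva

def marginal_mean(model, steel, pva, background):
    """Monte Carlo marginalization: average predictions over background samples."""
    return float(np.mean(model(steel, pva, background)))

def synergy_gain_map(model, steel_grid, pva_grid, background):
    """Assumed definition: hybrid effect minus the sum of single-fiber effects."""
    f00 = marginal_mean(model, 0.0, 0.0, background)
    gain = np.empty((len(steel_grid), len(pva_grid)))
    for i, s in enumerate(steel_grid):
        fs0 = marginal_mean(model, s, 0.0, background)
        for j, p in enumerate(pva_grid):
            f0p = marginal_mean(model, 0.0, p, background)
            fsp = marginal_mean(model, s, p, background)
            gain[i, j] = fsp - fs0 - f0p + f00
    return gain

# Random background samples for the non-fiber variables (assumed ranges).
rng = np.random.default_rng(0)
background = np.column_stack([
    rng.uniform(0.25, 0.50, 2000),
    rng.uniform(0.30, 1.00, 2000),
])

# Fiber volume fractions in percent; the grid spacing is an assumption.
steel_grid = np.linspace(0.0, 2.0, 17)
pva_grid = np.linspace(0.0, 2.0, 17)

gain = synergy_gain_map(toy_model, steel_grid, pva_grid, background)

# Location of the largest mean synergy gain on the grid.
i, j = np.unravel_index(np.argmax(gain), gain.shape)
peak = (steel_grid[i], pva_grid[j], gain[i, j])

# Positive-synergy window: grid cells whose gain exceeds a small tolerance.
tol = 1e-6
window = gain > tol
```

With a real fitted model, `toy_model` would be replaced by a wrapper around the model's prediction call, and `background` by samples drawn from the database. Because the toy function contains a single interaction term, its gain map reduces to that term, which makes the sketch easy to check by hand.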