Machine Learning for Building Code Waiver Assessment: A Predictive Analytics Framework from 197 Singapore BCA Cases (2021–2023)

Abstract

Building code waiver assessments in Singapore remain largely discretionary, relying on case officers’ subjective judgement with limited decision-support tooling. This study presents the first machine learning framework for predicting building code waiver outcomes, trained on 197 historically decided cases from the Building and Construction Authority (BCA) across five waiver categories: barrier-free accessibility (n = 45), ventilation (n = 61), staircase design (n = 37), safety provisions (n = 30), and structural modifications (n = 24), spanning 2021 to 2023. Fourteen engineered features, including documentation completeness, technical justification quality, and compliance history, were extracted through domain-expert annotation. Four models were evaluated: L2-regularised logistic regression, random forest, gradient boosting (XGBoost 2.0.1), and a weighted ensemble. The ensemble achieved the highest predictive accuracy of 83.7% (95% CI: 79.2–88.1%) with an area under the receiver operating characteristic curve (AUC) of 0.891 (95% CI: 0.854–0.928), significantly outperforming all individual models (McNemar’s test, p < 0.05). SHAP analysis revealed that documentation completeness and technical justification quality collectively account for 55% of prediction variance. A companion five-by-five risk assessment matrix, combining predicted rejection probability with consequence severity, stratified cases into actionable risk tiers correlating with observed approval rates ranging from 90.3% (very low risk) to 10.0% (very high risk; Spearman rho = −0.71, p < 0.001). Performance varied across waiver categories: ventilation waivers achieved the highest balanced accuracy (87.1%) while safety waivers proved most challenging (balanced accuracy 64.3%, sensitivity 40.0%). 
The five waiver categories serve as domain stratification variables, while the machine learning target variable is the binary regulatory outcome: Approved (46.2% of cases) or Rejected (53.8%). The framework offers a transparent, data-driven decision-support complement to regulatory judgement, learning patterns from historically decided applications within the 2021–2023 BCA context, and demonstrates feasibility for integration into Singapore’s CORENET X digital building submission platform.
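
As an illustration of the five-by-five risk assessment matrix described in the abstract, the following minimal sketch maps a predicted rejection probability and a consequence severity rating to a risk tier. The abstract does not state the band edges, the scoring rule, or the tier cut-offs, so the equal-width probability bands, the multiplicative score, and the thresholds below are all assumptions, and the names `likelihood_band` and `risk_tier` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# ASSUMPTION: equal-width probability bands; the study's actual edges are not given.
PROB_EDGES = [0.2, 0.4, 0.6, 0.8]
# ASSUMPTION: inclusive upper bounds on the score (1..25) for the first four tiers.
TIER_UPPER = [4, 9, 14, 19]
TIERS = ["very low", "low", "moderate", "high", "very high"]


def likelihood_band(p_reject: float) -> int:
    """Map a predicted rejection probability to a likelihood band from 1 to 5."""
    if not 0.0 <= p_reject <= 1.0:
        raise ValueError("p_reject must lie in [0, 1]")
    return bisect_right(PROB_EDGES, p_reject) + 1


def risk_tier(p_reject: float, severity: int) -> tuple[int, str]:
    """Combine likelihood band and severity (1 to 5) into a score and a tier.

    ASSUMPTION: score = band * severity, a common risk-matrix convention
    that the abstract does not confirm.
    """
    if severity not in range(1, 6):
        raise ValueError("severity must be an integer from 1 to 5")
    score = likelihood_band(p_reject) * severity
    return score, TIERS[bisect_left(TIER_UPPER, score)]


if __name__ == "__main__":
    # Example: a case with a 70% predicted rejection probability and severity 4.
    print(risk_tier(0.7, 4))
```

Under these assumed thresholds, a case would be assigned to one of five tiers that could then be compared against observed approval rates, as the abstract reports for its own matrix.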
