Predicting sustainability performance in construction projects using machine learning: a comparative study
Abstract
The construction sector plays a major role in global environmental degradation, contributing significantly to carbon emissions, energy consumption, and waste generation. Despite this impact, few studies have explored predictive modelling of sustainability performance using survey-based project data, particularly within Saudi Arabia. This study addresses this gap by applying supervised machine learning techniques to predict carbon emissions and classify projects into emission-level categories. A structured survey generated 150 validated responses from key stakeholders across major Saudi cities, covering 19 project and sustainability attributes. Three machine learning models, Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGB), were trained and evaluated using nested 10 × 5-fold cross-validation. RF achieved the strongest regression performance (mean CV R² = 0.439 ± 0.247; test R² = 0.734) and the highest classification accuracy (CV accuracy = 0.790 ± 0.094; test accuracy = 78%), outperforming SVM and XGB. SHAP analysis consistently identified waste generation, energy consumption, and project duration as the most influential predictors of carbon emissions. The findings deliver a data-driven framework for early sustainability assessment and support informed policy and planning aligned with Saudi Vision 2030.
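The nested cross-validation scheme mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: it assumes "10 × 5-fold" means 10 outer folds for performance estimation and 5 inner folds for hyperparameter selection, it uses a one-feature k-nearest-neighbour regressor as a stand-in for SVM, RF, and XGB, and it runs on synthetic data rather than the survey responses. The function names, the hyperparameter grid, and the data-generating rule are all hypothetical.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n, k, rng):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train, x, k):
    """Stand-in model: average the targets of the k nearest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return mean(y for _, y in nearest)


def evaluate(train, test, k):
    """R^2 of the stand-in model fitted on `train` and scored on `test`."""
    preds = [knn_predict(train, x, k) for x, _ in test]
    return r2_score([y for _, y in test], preds)


def nested_cv(data, outer_k=10, inner_k=5, grid=(1, 3, 5, 9), seed=0):
    """Return one held-out R^2 per outer fold.

    The inner loop sees only the outer-training rows, so the outer test
    fold never influences the choice of hyperparameter.
    """
    rng = random.Random(seed)
    outer_scores = []
    for test_idx in k_fold_indices(len(data), outer_k, rng):
        held_out = set(test_idx)
        train = [data[j] for j in range(len(data)) if j not in held_out]
        test = [data[j] for j in test_idx]

        inner_folds = k_fold_indices(len(train), inner_k, rng)

        def inner_score(k):
            scores = []
            for val_idx in inner_folds:
                val_set = set(val_idx)
                fit = [train[j] for j in range(len(train)) if j not in val_set]
                val = [train[j] for j in val_idx]
                scores.append(evaluate(fit, val, k))
            return mean(scores)

        best_k = max(grid, key=inner_score)  # hyperparameter chosen on inner folds only
        outer_scores.append(evaluate(train, test, best_k))
    return outer_scores


# Synthetic data (assumption): 150 rows, one feature, linear signal plus noise.
data_rng = random.Random(42)
data = []
for _ in range(150):
    x = data_rng.uniform(0.0, 10.0)
    data.append((x, 2.0 * x + data_rng.gauss(0.0, 3.0)))

scores = nested_cv(data)
print(f"nested CV R^2: {mean(scores):.3f} +/- {stdev(scores):.3f}")
```

Reporting the mean and standard deviation of the outer-fold scores mirrors the "mean ± spread" form of the CV figures quoted in the abstract. With only 15 rows per outer test fold in a 150-row dataset, fold-to-fold variation in R² can be large, which is consistent with the wide interval reported there.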