A hybrid regression framework for estimating urban residential community drainage infrastructure: Model optimization and recursive prediction
Abstract
Urban drainage infrastructure plays a vital role in maintaining water quality and mitigating urban flood risks. However, estimating the quantity of residential drainage facilities in megacities remains challenging due to infrastructure aging, fragmented management, and data deficiencies. This study proposes a hybrid regression framework combining multiple linear regression (MLR), support vector regression (SVR), and random forest regression (RFR) to improve the prediction accuracy of drainage facility quantities at the residential community scale. Based on data from 120 residential communities in City S, the study analyzes correlations between drainage facilities (e.g., pipeline length, number of manholes) and community attributes (e.g., building area, number of buildings, number of households), and incorporates a recursive forecasting mechanism to enhance multi-step estimation accuracy. The MLR model was optimized using weighted least squares to address heteroscedasticity, while SVR and RFR parameters were tuned via grid search and ten-fold cross-validation. The results indicate that the weighted linear regression (WLR) model performs best in predicting pipeline length, while the SVR model achieves higher accuracy in predicting the number of manholes. The proposed modeling framework offers reliable data support and a practical methodology for the planning and refined management of drainage systems in highly heterogeneous megacities.
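To illustrate the weighted least squares step and one possible reading of the recursive mechanism, the following is a minimal sketch and not the authors' implementation. It assumes NumPy and uses entirely synthetic data. The helper names (fit_wls, predict), the choice of predictors, the inverse-variance weights, and the idea of feeding the predicted pipeline length into a second model for manhole counts are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

def fit_wls(X, y, weights):
    """Weighted least squares: minimise sum_i w_i * (y_i - [1, x_i] @ beta)**2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    sw = np.sqrt(w)
    # Scaling each row by sqrt(w_i) turns WLS into an ordinary least-squares problem.
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return beta

def predict(beta, X):
    """Apply fitted coefficients (intercept first) to new predictors."""
    X = np.asarray(X, dtype=float)
    return beta[0] + X @ beta[1:]

# Synthetic stand-in for 120 communities (all values are made up).
rng = np.random.default_rng(0)
n = 120
building_area = rng.uniform(1e4, 2e5, n)   # hypothetical, in m^2
n_buildings = rng.integers(5, 60, n)       # hypothetical counts
X = np.column_stack([building_area, n_buildings])

# Heteroscedastic noise: the spread grows with building area.
noise_sd = 0.002 * building_area
pipe_len = 200 + 0.03 * building_area + 15 * n_buildings + rng.normal(0, noise_sd)

# Assumed weights: inverse of the noise variance. Here the variance is known
# because the data are simulated; with real data it would have to be estimated.
beta_pipe = fit_wls(X, pipe_len, 1.0 / noise_sd**2)

# Assumed recursive step: the predicted pipeline length becomes the input
# of a second model for the number of manholes.
pipe_hat = predict(beta_pipe, X)
manholes = 5 + 0.04 * pipe_len + rng.normal(0, 3, n)
beta_mh = fit_wls(pipe_hat[:, None], manholes, np.ones(n))
mh_hat = predict(beta_mh, pipe_hat[:, None])
```

With real survey data, the weights would typically be estimated from the residuals of an initial unweighted fit. The SVR and RFR models mentioned in the abstract could be tuned separately, for example with a grid search wrapped around ten-fold cross-validation.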