Comparison of ARIMA model, ARIMA-BPNN model and ARIMA-ERNN model in predicting incidence of dengue in China
Abstract
Background: Dengue remains an enduring public health concern across tropical and subtropical regions of China, with a disproportionate burden observed in economically disadvantaged areas. Dengue outbreaks can overwhelm healthcare systems and impede economic development. Timely and accurate predictive models of dengue incidence are critical for strengthening early warning systems and informing the strategic allocation of public health resources.

Methods: Monthly reported dengue cases in China from 2004 to 2024 were obtained from the publicly available data of the Chinese Center for Disease Control and Prevention. For model development and validation, the data were divided into training and testing subsets. Three models were then constructed: the autoregressive integrated moving average (ARIMA) model, a combined model of ARIMA and the back propagation neural network (ARIMA-BPNN), and a combined model of ARIMA and the Elman recurrent neural network (ARIMA-ERNN). The predictive accuracy of each model was evaluated via the mean absolute error (MAE), root mean square error (RMSE), corrected mean absolute percentage error (cMAPE) and coefficient of determination (R²) on both the training and testing sets.

Results: From 2004 to 2024, the average incidence of dengue in mainland China was 0.486 cases per 1,000,000 people annually, with yearly rates ranging from 0.031 to 34.161 cases per 1,000,000 people. While all three models fitted the observed data adequately, the ARIMA-BPNN and ARIMA-ERNN models consistently outperformed the conventional ARIMA model. Specifically, the ARIMA-BPNN model yielded the lowest MAE, RMSE, and cMAPE values, along with the highest R², on both the training and testing datasets. These findings suggest that the ARIMA-BPNN model is better able to capture the nonlinear and dynamic patterns inherent in dengue transmission. In contrast, the ARIMA model was less accurate in forecasting peak incidences and abrupt temporal fluctuations, highlighting its limitations in modelling complex epidemiological trends.

Conclusion: Hybrid modelling approaches that combine ARIMA with neural network architectures, particularly the ARIMA-BPNN model, demonstrated superior predictive accuracy in forecasting dengue incidence. These findings may contribute to outbreak preparedness and to the timeliness and effectiveness of dengue control in China.
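
As an illustration of the evaluation step described in the Methods, the following is a minimal sketch of three of the four reported metrics (MAE, RMSE and R²) using their standard textbook definitions. It is not the authors' code. The function names and the sample values are hypothetical, and R² is assumed here to be 1 minus the ratio of residual to total sum of squares, which may differ from the definition used in the study. cMAPE is omitted because the abstract does not define the correction applied.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: penalises large errors more than MAE does."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly case counts, for illustration only (not the study's data).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print("MAE: ", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("R^2: ", r_squared(actual, predicted))
```

Lower MAE and RMSE and a higher R² indicate a closer fit, which is the sense in which the abstract ranks ARIMA-BPNN above the other two models. Computing the same functions separately on the training and testing subsets mirrors the comparison described above.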