A Comparative Study of TabNet and Classical Machine Learning Models for Landslide Prediction
Abstract
Landslides are a major geohazard that endangers human life, infrastructure, and ecosystems, so accurate susceptibility mapping is needed to support proactive disaster risk management. Traditional machine learning models such as Random Forest (RF), Support Vector Machine (SVM), Artificial Neural Networks (ANN), and XGBoost have proven useful but often struggle to capture the complex, high-dimensional interactions among heterogeneous geospatial factors. This study introduces a framework based on TabNet, a deep learning architecture designed for tabular data that uses sequential attention and interpretable decision steps to model intricate feature relationships. A synthetic yet realistic dataset of 180 samples and 21 conditioning factors (including slope, elevation, rainfall, land use, lithology, and proximity to faults and rivers) was constructed from geospatial patterns reported in 17 peer-reviewed studies. TabNet was benchmarked against the four classical models listed above. TabNet achieved the highest predictive performance, with an accuracy of 87%, an AUC-ROC of 0.92, and an F1-score of 0.82, outperforming all baseline models. Feature importance analysis identified slope, rainfall intensity, and land cover as the most critical predictors of landslide occurrence. TabNet also generalized well across diverse synthetic terrains resembling the Himalayas, the Alps, Southeast Asia, and the Zagros Mountains, while maintaining low misclassification rates and competitive training efficiency. These findings highlight TabNet's robustness, interpretability, and predictive capacity, positioning it as a promising tool for geospatial hazard assessment. Future work will focus on integrating temporal rainfall records, high-resolution remote sensing, and real-time seismic data to improve predictive responsiveness and enable operational deployment in early warning systems.
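For readers less familiar with the three metrics reported in the abstract (accuracy, AUC-ROC, and F1-score), the following is a minimal sketch of how each can be computed for a binary landslide / no-landslide classifier. The function names, the 0.5 decision threshold, and the eight toy labels and scores are assumptions made purely for illustration; they are not the study's data, code, or results.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    return 2 * tp / (2 * tp + fp + fn)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form equals the area under
    the ROC curve and is adequate for small datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (invented values, NOT from the study):
# 1 = landslide, 0 = no landslide; scores are predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))  # 0.75
print(f1_score(y_true, y_pred))  # 0.75
print(auc_roc(y_true, scores))   # 0.9375
```

Accuracy and F1 depend on the chosen decision threshold, whereas AUC-ROC is computed from the ranking of the scores alone, which is why the three figures can differ for the same model.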