Comparative Performance of Machine Learning Models for Landslide Susceptibility Assessment: Impact of Sampling Strategies in Highway Buffer Zone

Zhenyu Tang
Shumao Qiu
Haoying Xia
Daming Lin
Mingzhou Bai

Abstract

Landslide susceptibility assessment is critical for hazard mitigation and land-use planning. This study evaluates how two non-landslide sampling methods, random sampling and sampling constrained by the Global Landslide Hazard Map (GLHM), affect the performance of various machine learning and deep learning models, including Naïve Bayes (NB), Support Vector Machine (SVM), an SVM-Random Forest hybrid (SVM-RF), and XGBoost. The study area is a 2 km buffer zone along the Duku Highway in Xinjiang, China, with 102 landslide points and 102 non-landslide points, the latter extracted by each of the two sampling methods. Models were evaluated using ROC curves and non-parametric significance tests based on 20 repetitions of 5-fold spatial cross-validation. GLHM sampling consistently improved AUROC and accuracy across all models (AUROC gains: NB +8.44, SVM +7.11, SVM-RF +3.45, XGBoost +3.04; accuracy gains: NB +11.30%, SVM +8.33%, SVM-RF +7.40%, XGBoost +8.31%). XGBoost delivered the best performance under both sampling strategies, reaching 94.61% AUROC and 84.30% accuracy with GLHM sampling. SHAP analysis showed that GLHM sampling stabilized feature importance rankings and identified STI, TWI, and NDVI as the main controlling factors for landslides in the study area. These results underscore the importance of hazard-informed sampling for improving the accuracy and interpretability of landslide susceptibility models.
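
The difference between the two non-landslide sampling strategies can be illustrated with a minimal sketch. The abstract does not state how the GLHM constrains the draw, so the rule used here (only cells in the lowest hazard class are eligible) is an assumption, as are the synthetic grid, the four hazard classes, and the names `sample_non_landslides` and `max_hazard`. It is not the authors' implementation.

```python
import random


def sample_non_landslides(cells, landslide_ids, n, max_hazard=None, seed=0):
    """Draw n non-landslide cells from a candidate grid.

    cells         : list of dicts with keys "id" and "hazard"
                    (hypothetical hazard class, 1 = lowest).
    landslide_ids : set of cell ids that contain a mapped landslide.
    max_hazard    : None gives plain random sampling; an integer keeps only
                    cells whose hazard class is <= max_hazard (assumed rule).
    """
    rng = random.Random(seed)
    candidates = [
        c for c in cells
        if c["id"] not in landslide_ids
        and (max_hazard is None or c["hazard"] <= max_hazard)
    ]
    if len(candidates) < n:
        raise ValueError("not enough eligible cells for the requested sample size")
    # Sampling without replacement, so no cell is drawn twice.
    return rng.sample(candidates, n)


# Synthetic stand-in for the buffer zone: 5000 cells with random hazard classes.
grid_rng = random.Random(42)
cells = [{"id": i, "hazard": grid_rng.randint(1, 4)} for i in range(5000)]
landslide_ids = set(grid_rng.sample(range(5000), 102))

# Strategy 1: random sampling over all non-landslide cells.
random_pts = sample_non_landslides(cells, landslide_ids, n=102)

# Strategy 2: sampling constrained to the lowest hazard class (assumption).
constrained_pts = sample_non_landslides(cells, landslide_ids, n=102, max_hazard=1)
```

Under random sampling, some non-landslide points may fall in terrain that is in fact landslide-prone, which adds label noise to the negative class. Restricting the draw to low-hazard cells is one plausible way a hazard map could reduce that noise.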

Version published to 10.3390/app15158416
Jul 29, 2025
Version published to 10.20944/preprints202507.0119.v1
Jul 2, 2025
