Risk-Informed Machine Learning Models for Renewal Classification in Motor Insurance
Abstract
Accurate prediction of motor-insurance policy renewals is essential for pricing, customer retention, and operational decision-making in modern digital insurance ecosystems. This study develops an interpretable intelligent system for classifying Type 1 motor-insurance policy renewals using a real-world portfolio of 70,290 private-car policies from Thailand. Five machine-learning models (Binary Logistic Regression, K-Nearest Neighbors, Support Vector Machines, Random Forests, and XGBoost) are systematically evaluated across multiple curated feature sets generated through statistical filtering, stepwise selection, and permutation-based importance. Non-parametric statistical tests are used to compare model performance across scenarios. Experimental results show that a reduced four-feature Random Forest model (car age, net premium, sum insured, and car group) achieves the highest predictive performance (AUC = 0.9962; F1 = 0.9815), outperforming full-feature models while being more computationally efficient. To ensure transparency and regulatory alignment, a SHAP-based explainability layer is integrated to quantify the marginal influence of each predictor on renewal decisions, revealing strong behavioral and pricing effects associated with vehicle age and premium structure. The proposed system provides interpretable, scalable, and deployment-ready insights for insurers, supporting dynamic pricing, risk-adjusted retention strategies, and digital customer engagement. The findings demonstrate how efficient and transparent ML-driven intelligent systems can enhance decision support in rapidly evolving motor-insurance markets.
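For readers unfamiliar with the quantities named in the abstract, the following is a minimal Python sketch of how AUC, F1, and a permutation-based importance score can be computed. It is not the authors' pipeline: the function names, the toy rows, the `noise` column, and the toy scoring rule are all hypothetical, and the toy data has no relation to the study's portfolio or findings. Only the feature name `car_age` is borrowed from the abstract.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(y_true, y_pred):
    """F1 as the harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def permutation_importance(score_fn, rows, y_true, column, n_repeats=10, seed=0):
    """Mean drop in AUC when the values of one column are shuffled across rows."""
    rng = random.Random(seed)
    baseline = roc_auc(y_true, [score_fn(r) for r in rows])
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, column: v} for r, v in zip(rows, shuffled)]
        drops.append(baseline - roc_auc(y_true, [score_fn(r) for r in permuted]))
    return sum(drops) / n_repeats


# Toy data (hypothetical): ten policies, labelled by an arbitrary rule on car_age.
rows = [{"car_age": age, "noise": (age * 7) % 3} for age in range(1, 11)]
labels = [1 if r["car_age"] <= 5 else 0 for r in rows]


def toy_score(row):
    # Hypothetical stand-in for a fitted classifier; it uses car_age only.
    return -row["car_age"]


imp_car_age = permutation_importance(toy_score, rows, labels, "car_age")
imp_noise = permutation_importance(toy_score, rows, labels, "noise")
```

In this sketch, shuffling a column the scorer ignores leaves the AUC unchanged, so its importance is zero, while shuffling a column the scorer relies on lowers the AUC. A feature-selection step of the kind the abstract mentions could rank predictors by such drops, although the abstract does not specify the exact procedure used.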