Optimized Machine Learning for Insurance Cost Prediction

Abstract

Machine learning (ML) is increasingly used in the insurance industry to predict costs and inform pricing. Accurate predictions help insurers set fair prices while keeping coverage affordable for customers. However, many ML models are difficult to interpret, which leaves it unclear how they reach their predictions. This study aims to improve both prediction accuracy and interpretability by combining hyperparameter tuning with Optuna and feature importance analysis with SHAP (SHapley Additive exPlanations). Three models (Ridge Regression, Random Forest, and XGBoost) were optimized and tested. XGBoost performed best, with a median R-squared of 0.8655 and an RMSE of 4136.59. SHAP analysis identified smoking status, BMI, and age as the most important factors affecting insurance costs. These findings indicate that combining model tuning with explainability tools improves ML models for insurance pricing.
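
To make the idea behind SHAP more concrete, the sketch below computes exact Shapley values for a single prediction by averaging each feature's marginal contribution over all feature orderings. Everything in it is an illustrative assumption: `toy_cost`, its coefficients, and the `BASELINE` and `PERSON` profiles are made up for the example and are not the study's model, data, or results. The SHAP library uses more efficient estimators than this brute-force enumeration, so this is only a minimal sketch of the underlying attribution principle.

```python
from itertools import permutations

# Hypothetical reference profile and the profile whose prediction we explain.
BASELINE = {"smoker": 0, "bmi": 25.0, "age": 30}
PERSON = {"smoker": 1, "bmi": 32.0, "age": 45}


def toy_cost(x):
    """Made-up cost function (NOT the study's model).

    Includes a smoker x BMI interaction so the attribution is not trivial.
    """
    return (
        1000.0
        + 100.0 * x["age"]
        + 50.0 * x["bmi"]
        + 5000.0 * x["smoker"]
        + 200.0 * x["smoker"] * max(x["bmi"] - 30.0, 0.0)
    )


def shapley_values(model, baseline, instance):
    """Exact Shapley values by enumerating every feature ordering.

    Features start at their baseline values and are switched to the
    instance's values one at a time; each switch's change in the model
    output is credited to that feature, then averaged over all orderings.
    """
    features = list(instance)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for f in order:
            current[f] = instance[f]
            updated = model(current)
            totals[f] += updated - previous
            previous = updated
    return {f: totals[f] / len(orderings) for f in features}


phi = shapley_values(toy_cost, BASELINE, PERSON)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:.2f}")
```

For this toy function the attributions are 5200 for `smoker`, 1500 for `age`, and 550 for `bmi`. They sum to 7250, the gap between the prediction for `PERSON` and the prediction for `BASELINE`, and the interaction term is split evenly between `smoker` and `bmi`. Enumerating orderings scales factorially with the number of features, so it is practical only for very small examples like this one.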