A Comparative Study of Machine Learning and Deep Learning Models for Predicting Medical Insurance Costs with Explainable AI

Abstract

Accurate estimation of medical insurance costs is essential for efficient, transparent health care and for the improvement and sustainability of health systems. This paper assesses the predictive performance of several machine learning and deep learning models, including Linear Regression, Random Forest, XGBoost, and an artificial neural network (ANN), using demographic and health-related features to predict individual medical costs. The dataset was pre-processed, and feature engineering was applied to improve the models’ performance. Of all the models, the ANN performed best, with an R-squared of 0.88 on the test dataset and a mean R-squared of 0.9886 over five-fold cross-validation. We also applied SHAP (SHapley Additive exPlanations) to interpret the ANN's predictions and found that age, BMI, and smoking status were significant predictors of individual medical costs. The results show that accuracy and interpretability together support reliable and transparent AI applications for predicting health costs.
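
The two evaluation quantities named in the abstract, R-squared on held-out data and mean R-squared over five-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the single-feature least-squares model, the helper names (`r_squared`, `fit_line`, `cross_val_r2`), and the synthetic age/cost data are all assumptions made purely for illustration, and the numbers it prints say nothing about the paper's results.

```python
import random

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def cross_val_r2(xs, ys, k=5, seed=0):
    """Return (mean R-squared, per-fold scores) over k shuffled folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint held-out index sets
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the other k-1 folds, score on the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in held_out]
        scores.append(r_squared([ys[i] for i in held_out], preds))
    return sum(scores) / k, scores

# Synthetic, hypothetical data: cost grows linearly with age plus noise.
rng = random.Random(42)
ages = [rng.uniform(18, 64) for _ in range(200)]
costs = [250.0 * a + 3000.0 + rng.gauss(0, 1500.0) for a in ages]

mean_r2, fold_scores = cross_val_r2(ages, costs)
print(f"mean R^2 over 5 folds: {mean_r2:.3f}")
```

Each fold's score is computed only on data the model did not see during fitting, so the mean is an estimate of out-of-sample fit; a single train/test split gives one such score and can differ noticeably from the cross-validated mean.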
