Credit Risk Prediction with Self-Supervised Learning: An Explainable AI Approach Integrating SHAP and LIME



Abstract

Accurate and interpretable credit risk prediction is critical for banks to reduce losses, make sound lending decisions, and maintain portfolio stability. Current machine learning techniques struggle with class imbalance, complex borrower behavior, and low interpretability, which limits their real-world applicability in banking. We introduce an ensemble-based approach to peer-to-peer (P2P) loan default prediction using SimCLR embeddings and feature extraction. The model captures high-order patterns in borrower data while balancing default and non-default cases. To increase robustness and interpretability, SHAP and LIME are combined to yield global and local explanations consistent with expert financial evaluation. To the best of our knowledge, this work is the first to combine self-supervised embeddings and explainable AI for end-to-end credit risk estimation. To support reproducibility, we perform several independent runs with different random seeds and report the mean ± standard deviation of the key metrics. Testing on the Credit Risk dataset shows an overall accuracy of 92.51%, recall of 91.02%, and F1-score of 90.74%. Per-class results show good performance for Non-Default (precision 91.88%, recall 99.34%, F1 95.46%) and Default (precision 96.30%, recall 66.33%, F1 78.55%), indicating good handling of imbalanced data. The framework surpasses standard baselines in prediction accuracy, interpretability, and practical scalability, making it a viable solution for transparent, automated, and reliable credit risk assessment.
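As a reading aid, the per-class F1 scores quoted above can be cross-checked against the reported precision and recall using the standard harmonic-mean definition of F1. The sketch below is not the authors' code; the function name `f1_from_precision_recall` and the dictionary layout are illustrative assumptions, and only the precision, recall, and F1 figures come from the abstract.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score (harmonic mean of precision and recall).

    Inputs and output share the same scale (percentages here).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class (precision, recall) pairs as reported in the abstract, in percent.
reported = {
    "Non-Default": (91.88, 99.34),
    "Default": (96.30, 66.33),
}

# Recompute F1 for each class, rounded to two decimals like the abstract.
recomputed = {
    label: round(f1_from_precision_recall(p, r), 2)
    for label, (p, r) in reported.items()
}

for label, f1 in recomputed.items():
    print(f"{label}: F1 = {f1:.2f}%")
```

Under this definition the recomputed values agree with the reported per-class F1 scores (95.46% and 78.55%). The gap between the two classes comes from the Default recall of 66.33%: roughly one in three actual defaults is not flagged, even though flagged defaults are rarely wrong.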
