Transformer-based NLP Approaches for Credit Risk Prediction: A Systematic Review
Abstract
This systematic review explores how Natural Language Processing (NLP) and Large Language Models (LLMs), such as BERT, RoBERTa, and LLaMA, are applied to enhance credit risk classification. Traditional models rely primarily on structured data, but the growing availability of unstructured text has opened new avenues for analysis. We employed a systematic literature review methodology across Scopus, ScienceDirect, and Web of Science, guided by PRISMA principles, and filtered 284 studies down to 63 using semantic similarity scoring. Results reveal that transformer-based models substantially improve predictive accuracy, especially when hybridized with temporal and sentiment-aware components. We further discuss ethical implications, including algorithmic bias and regulatory compliance. The review emphasizes the transformative potential of NLP in financial modeling while identifying interpretability and data governance as ongoing challenges.