A Machine Learning Framework for Customer Segmentation in the Korean Credit Card Industry
Abstract
This study presents a data-driven framework for segmenting customers in the highly competitive Korean credit card market, using a large-scale, anonymized dataset from a leading issuer. A systematic feature reduction process narrowed an initial set of 565 variables to 138 informative attributes. Principal Component Analysis was then used to transform these attributes into three interpretable dimensions: Spending Volume, Credit & Loan Dependency, and Membership & Credit. We evaluated several clustering algorithms, including K-means, Hierarchical Clustering, and Self-Organizing Maps, and found that K-means with three segments provided the highest internal validity and the clearest interpretability along the value-risk axes. The analysis identified three distinct customer segments: (1) High-Value, Low-Risk Customers, characterized by high spending and stable repayment; (2) Low-Value, Low-Risk Customers, the largest and most conservative segment; and (3) High-Risk Customers, who spend actively but depend heavily on loans and installments and show a higher delinquency rate despite long membership tenure. These findings have actionable managerial implications for differentiated strategies in value creation, customer activation, and risk-aware relationship management. To the best of our knowledge, this is the first empirical study to segment customers using actual behavioral data from the Korean credit card industry, and it offers a practical model for precision marketing and risk management in the digital finance era.
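The clustering step described in the abstract (K-means, with the number of segments chosen by internal validity) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. The data are synthetic, the three features (spending, loan dependency, tenure) and their values are hypothetical stand-ins for the study's principal components, the PCA step is skipped, the silhouette coefficient is assumed as the internal validity measure, and the farthest-point initialization is a simplification chosen to keep the run deterministic. In practice a library such as scikit-learn would normally be used instead of hand-written routines.

```python
import math
import random


def dist(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def standardize(rows):
    """Scale every column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)   # own-cluster cohesion
        b = min(sum(dist(p, q) for q in other) / len(other)  # nearest other cluster
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic customers (hypothetical values): spending, loan dependency, tenure in years.
rng = random.Random(0)
profiles = [(3000, 0.10, 5), (500, 0.10, 5), (2000, 0.70, 12)]
raw = [[rng.gauss(spend, 100), rng.gauss(loan, 0.03), rng.gauss(tenure, 0.5)]
       for spend, loan, tenure in profiles for _ in range(30)]

X = standardize(raw)
scores = {k: silhouette(X, kmeans(X, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
labels = kmeans(X, best_k)

for k, s in scores.items():
    print(f"k={k}: mean silhouette = {s:.3f}")
print("selected k:", best_k)
```

Standardizing before clustering matters here because K-means relies on Euclidean distance, so an unscaled spending column would otherwise dominate the loan and tenure columns. The same comparison loop could be extended to other algorithms, such as hierarchical clustering, by swapping out the `kmeans` call.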