Research Article | Open Access
Volume 73 | Issue 12 | Year 2025 | Article Id. IJCTT-V73I12P105 | DOI: https://doi.org/10.14445/22312803/IJCTT-V73I12P105
Explainable AI for Credit Risk and Customer Segmentation in Subprime Lending: A Comprehensive Framework with Implementation Protocol
Aishwary Bodhale
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 25 Oct 2025 | 29 Nov 2025 | 10 Dec 2025 | 27 Dec 2025 |
Citation:
Aishwary Bodhale, "Explainable AI for Credit Risk and Customer Segmentation in Subprime Lending: A Comprehensive Framework with Implementation Protocol," International Journal of Computer Trends and Technology (IJCTT), vol. 73, no. 12, pp. 27-39, 2025. Crossref, https://doi.org/10.14445/22312803/IJCTT-V73I12P105
Abstract
Subprime lending creates an operational tension between predictive accuracy and regulatory transparency. Although advanced machine learning methods are more effective at default prediction, their low interpretability limits their use in regulated credit systems. This paper presents an integrated explainable artificial intelligence framework that builds on eXtreme Gradient Boosting (XGBoost) and SHapley Additive exPlanations (SHAP) to support transparent credit risk assessment and explanation-based customer segmentation. The framework makes three contributions: it maintains high predictive performance while providing instance-level explanations, it clusters borrowers on explanation vectors rather than risk scores, and it automatically generates adverse action notices consistent with regulation. An evaluation on a simulated dataset of 125,000 subprime loan applications, generated under realistic conditions, shows that the proposed approach achieves competitive predictive accuracy, a silhouette score of 0.61 for the explanation-based segmentation, a 12.7% reduction in defaults, and an 8.9% increase in approvals of creditworthy applicants. These results suggest that explainability is an operational capability that can improve both regulatory compliance and lending outcomes in subprime credit markets.
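To make the idea of instance-level explanations and explanation-driven adverse action reasons concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hand-written toy risk function in place of a trained XGBoost model, and it computes exact Shapley values by brute-force coalition enumeration instead of using the SHAP library. The feature names, baseline values, coefficients, and the helper names `risk_score`, `shapley_values`, and `adverse_action_reasons` are all hypothetical.

```python
from itertools import combinations
from math import factorial

# Hypothetical features and baseline (reference applicant) values.
FEATURES = ["utilization", "delinquencies", "income_k", "months_on_file"]
BASELINE = {"utilization": 0.40, "delinquencies": 0.0,
            "income_k": 45.0, "months_on_file": 60.0}


def risk_score(x):
    """Toy risk score (higher = riskier); a stand-in for a trained model."""
    score = 2.0 * x["utilization"] + 0.8 * x["delinquencies"]
    score -= 0.01 * x["income_k"] + 0.005 * x["months_on_file"]
    # Interaction term: utilization matters more when delinquencies exist.
    score += 0.5 * x["utilization"] * x["delinquencies"]
    return score


def shapley_values(x, baseline=BASELINE, model=risk_score):
    """Exact Shapley values by enumerating every feature coalition."""
    n = len(FEATURES)

    def value(coalition):
        # Coalition members take the applicant's values; the rest stay at baseline.
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in FEATURES}
        return model(mixed)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def adverse_action_reasons(phi, top_n=2):
    """Features that raised the risk score the most, largest first."""
    raised = [(f, v) for f, v in phi.items() if v > 0]
    raised.sort(key=lambda fv: fv[1], reverse=True)
    return [f for f, _ in raised[:top_n]]


applicant = {"utilization": 0.90, "delinquencies": 2.0,
             "income_k": 30.0, "months_on_file": 24.0}
phi = shapley_values(applicant)
reasons = adverse_action_reasons(phi)
```

The attributions in `phi` sum to the gap between the applicant's score and the baseline score, which is the additivity property that lets each contribution be read as a share of the decision. Brute-force enumeration grows exponentially with the number of features, so it is only practical for a handful of inputs; tree-specific SHAP algorithms exist to avoid that cost for models such as XGBoost.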
Keywords
Explainable AI, XGBoost, SHAP, Credit Risk Modelling, Subprime Lending, Customer Segmentation, Algorithmic Fairness, Regulatory Compliance.