Explainable AI for Credit Risk and Customer Segmentation in Subprime Lending: A Comprehensive Framework with Implementation Protocol

Aishwary Bodhale

doi:10.14445/22312803/IJCTT-V73I12P105

Research Article | Open Access

Volume 73 | Issue 12 | Year 2025 | Article Id. IJCTT-V73I12P105 | DOI : https://doi.org/10.14445/22312803/IJCTT-V73I12P105

Received: 25 Oct 2025 | Revised: 29 Nov 2025 | Accepted: 10 Dec 2025 | Published: 27 Dec 2025

Citation:

Aishwary Bodhale, "Explainable AI for Credit Risk and Customer Segmentation in Subprime Lending: A Comprehensive Framework with Implementation Protocol," International Journal of Computer Trends and Technology (IJCTT), vol. 73, no. 12, pp. 27-39, 2025. Crossref, https://doi.org/10.14445/22312803/IJCTT-V73I12P105

Abstract

Subprime lending creates an operational conflict between predictive accuracy and regulatory transparency. Although advanced machine learning methods are more effective at default prediction, their low interpretability limits their use in regulated credit systems. This paper presents an integrated explainable artificial intelligence framework that combines eXtreme Gradient Boosting (XGBoost) with SHAP (SHapley Additive exPlanations) to support transparent credit risk assessment and explanation-based customer segmentation. The framework makes three contributions: it maintains high predictive performance while providing instance-level explanations, it clusters borrowers on their explanation vectors rather than on risk scores, and it automatically generates regulation-consistent adverse action notices. Evaluation on a simulated dataset of 125,000 subprime loan applications, generated under realistic conditions, shows competitive predictive accuracy, a silhouette score of 0.61 for the explanation-based segmentation, a 12.7% reduction in defaults, and an 8.9% increase in approvals of creditworthy applicants. These results suggest that explainability is an operational capability that can improve both regulatory compliance and lending outcomes in subprime credit markets.
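
The three steps named in the abstract (instance-level attribution, clustering on explanation vectors, and adverse action reasons) can be illustrated with a minimal sketch. It is not the paper's implementation: a toy linear risk score stands in for XGBoost, because for a linear score with independent features the Shapley attribution has the closed form phi_i = w_i (x_i - E[x_i]), and a plain Lloyd's k-means stands in for whatever clustering the paper uses. The feature names, weights, and applicants are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical feature names and weights of a toy linear risk score
# (an assumed stand-in for the paper's XGBoost model).
FEATURES = ["utilization", "delinquencies", "income_ratio"]
WEIGHTS = [2.0, 1.5, -1.0]

def score(x):
    """Toy linear risk score: higher means riskier."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x))

def attributions(x, baseline):
    """Exact Shapley values for a linear score with independent features:
    phi_i = w_i * (x_i - E[x_i])."""
    return [w * (xi - bi) for w, xi, bi in zip(WEIGHTS, x, baseline)]

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm, seeded deterministically with the first k points."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [mean(col) for col in zip(*members)]
    return labels

def adverse_reasons(phi, top_n=2):
    """Features with the largest risk-increasing (positive) attribution."""
    ranked = sorted(zip(FEATURES, phi), key=lambda t: t[1], reverse=True)
    return [name for name, value in ranked[:top_n] if value > 0]

# Hypothetical applicants: [utilization, delinquencies, income_ratio].
applicants = [
    [0.9, 0.0, 0.5],  # high utilization
    [0.2, 3.0, 0.5],  # many delinquencies
    [0.8, 0.0, 0.6],  # high utilization
    [0.1, 2.0, 0.4],  # many delinquencies
]
baseline = [mean(col) for col in zip(*applicants)]
phis = [attributions(x, baseline) for x in applicants]

# Segment on explanation vectors, not on the scalar risk score.
segments = kmeans(phis, k=2)
reasons = [adverse_reasons(phi) for phi in phis]
```

In this toy data the segments follow the driver of risk (utilization versus delinquencies) rather than the level of risk, which is the distinction the abstract draws between explanation-based and score-based segmentation. The `reasons` lists are only candidate inputs to an adverse action notice; mapping them to regulation-consistent wording is outside this sketch.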

Keywords

Explainable AI, XGBoost, SHAP, Credit Risk Modelling, Subprime Lending, Customer Segmentation, Algorithmic Fairness, Regulatory Compliance.
