Inherently interpretable machine learning for credit scoring: Optimal classification tree with hyperplane splits

My bibliography Save this article

Inherently interpretable machine learning for credit scoring: Optimal classification tree with hyperplane splits

Author

Listed:

Tu, Jiancheng
Wu, Zhibin

Registered:

Abstract

An accurate and interpretable credit scoring model plays a crucial role in helping financial institutions reduce losses by promptly detecting, containing, and preventing defaulters. However, existing models often face a trade-off between interpretability and predictive accuracy. Traditional models like Logistic Regression (LR) offer high interpretability but may have limited predictive performance, while more complex models may improve accuracy at the expense of interpretability. In this paper, we tackle the credit scoring problem with imbalanced data by proposing two new classification models based on the optimal classification tree with hyperplane splits (OCT-H). OCT-H provides transparency and easy interpretation with ‘if-then’ decision tree rules. The first model, the cost-sensitive optimal classification tree with hyperplane splits (CSOCT-H). The second model, the optimal classification tree with hyperplane splits based on maximizing F1-Score (OCT-H-F1), aims to directly maximize the F1-score. To enhance model scalability, we introduce a data sample reduction method using data binning and feature selection. We then propose two solution methods: a heuristic approach and a method utilizing warm-start techniques to accelerate the solving process. We evaluated the proposed models on four public datasets. The results show that OCT-H significantly outperforms traditional interpretable models, such as Decision Trees (DT) and Logistic Regression (LR), in both predictive performance and interpretability. On certain datasets, OCT-H performs as well as or better than advanced ensemble tree models, effectively narrowing the gap between interpretable models and black-box models.

Suggested Citation

Tu, Jiancheng & Wu, Zhibin, 2025. "Inherently interpretable machine learning for credit scoring: Optimal classification tree with hyperplane splits," European Journal of Operational Research, Elsevier, vol. 322(2), pages 647-664.

Handle: RePEc:eee:ejores:v:322:y:2025:i:2:p:647-664
DOI: 10.1016/j.ejor.2024.10.046

Download full text from publisher

As the access to this document is restricted, you may want to

for a different version of it.

References listed on IDEAS

Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Romero Morales, Dolores, 2020. "Sparsity in optimal randomized classification trees," European Journal of Operational Research, Elsevier, vol. 284(1), pages 255-272.
Ali Aouad & Adam N. Elmachtoub & Kris J. Ferreira & Ryan McNellis, 2023. "Market Segmentation Trees," Manufacturing & Service Operations Management, INFORMS, vol. 25(2), pages 648-667, March.
Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
- Elena Ivona Dumitrescu & Sullivan Hué & Christophe Hurlin & Sessi Tokpavi, 2022. "Machine Learning for Credit Scoring: Improving Logistic Regression with Non Linear Decision Tree Effects," Post-Print hal-03331114, HAL.
Ikram Chaabane & Radhouane Guermazi & Mohamed Hammami, 2020. "Enhancing techniques for learning decision trees from imbalanced data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(3), pages 677-745, September.
Liu, Wanan & Fan, Hong & Xia, Meng, 2023. "Tree-based heterogeneous cascade ensemble model for credit scoring," International Journal of Forecasting, Elsevier, vol. 39(4), pages 1593-1614.
Dragos Florin Ciocan & Velibor V. Mišić, 2022. "Interpretable Optimal Stopping," Management Science, INFORMS, vol. 68(3), pages 1616-1638, March.
Luo, Jian & Yan, Xin & Tian, Ye, 2020. "Unsupervised quadratic surface support vector machine with application to credit risk assessment," European Journal of Operational Research, Elsevier, vol. 280(3), pages 1008-1017.
Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
Medina-Olivares, Victor & Calabrese, Raffaella & Crook, Jonathan & Lindgren, Finn, 2023. "Joint models for longitudinal and discrete survival data in credit scoring," European Journal of Operational Research, Elsevier, vol. 307(3), pages 1457-1473.
Wang, Zheqi & Crook, Jonathan & Andreeva, Galina, 2020. "Reducing estimation risk using a Bayesian posterior distribution approach: Application to stress testing mortgage loan default," European Journal of Operational Research, Elsevier, vol. 287(2), pages 725-738.
V L Miguéis & D F Benoit & D Van den Poel, 2013. "Enhanced decision support in credit scoring using Bayesian binary quantile regression," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 64(9), pages 1374-1383, September.
- V. L. Migu Is & D. F. Benoit & D. Van Den Poel, 2012. "Enhanced Decision Support in Credit Scoring Using Bayesian Binary Quantile Regression," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 12/803, Ghent University, Faculty of Economics and Business Administration.
Lara Marie Demajo & Vince Vella & Alexiei Dingli, 2020. "Explainable AI for Interpretable Credit Scoring," Papers 2012.03749, arXiv.org.
Verbraken, Thomas & Bravo, Cristián & Weber, Richard & Baesens, Bart, 2014. "Development and application of consumer credit scoring models using profit-based classification measures," European Journal of Operational Research, Elsevier, vol. 238(2), pages 505-513.
Gunnarsson, Björn Rafn & vanden Broucke, Seppe & Baesens, Bart & Óskarsdóttir, María & Lemahieu, Wilfried, 2021. "Deep learning for credit scoring: Do or don’t?," European Journal of Operational Research, Elsevier, vol. 295(1), pages 292-305.
B Baesens & T Van Gestel & S Viaene & M Stepanova & J Suykens & J Vanthienen, 2003. "Benchmarking state-of-the-art classification algorithms for credit scoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 54(6), pages 627-635, June.
Höppner, Sebastiaan & Baesens, Bart & Verbeke, Wouter & Verdonck, Tim, 2022. "Instance-dependent cost-sensitive learning for detecting transfer fraud," European Journal of Operational Research, Elsevier, vol. 297(1), pages 291-300.
George Petrides & Darie Moldovan & Lize Coenen & Tias Guns & Wouter Verbeke, 2022. "Cost-sensitive learning for profit-driven credit scoring," Journal of the Operational Research Society, Taylor & Francis Journals, vol. 73(2), pages 338-350, March.
Edward I. Altman, 1968. "Financial Ratios, Discriminant Analysis And The Prediction Of Corporate Bankruptcy," Journal of Finance, American Finance Association, vol. 23(4), pages 589-609, September.
Michael Bücker & Gero Szepannek & Alicja Gosiewska & Przemyslaw Biecek, 2022. "Transparency, auditability, and explainability of machine learning models in credit scoring," Journal of the Operational Research Society, Taylor & Francis Journals, vol. 73(1), pages 70-90, January.
Chen, Yujia & Calabrese, Raffaella & Martin-Barragan, Belen, 2024. "Interpretable machine learning for imbalanced credit scoring datasets," European Journal of Operational Research, Elsevier, vol. 312(1), pages 357-372.
Christian Lohmann & Thorsten Ohliger, 2019. "The total cost of misclassification in credit scoring: A comparison of generalized linear models and generalized additive models," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 38(5), pages 375-389, August.
Galina Andreeva & Edward I. Altman, 2021. "The Value Of Personal Credit History In Risk Screening Of Entrepreneurs: Evidence From Marketplace Lending," Journal of Financial Management, Markets and Institutions (JFMMI), World Scientific Publishing Co. Pte. Ltd., vol. 9(01), pages 1-25, June.

Full references (including those not matched with items on IDEAS)

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Shi, Yong & Qu, Yi & Chen, Zhensong & Mi, Yunlong & Wang, Yunong, 2024. "Improved credit risk prediction based on an integrated graph representation learning approach with graph transformation," European Journal of Operational Research, Elsevier, vol. 315(2), pages 786-801.
Tigges, Maximilian & Mestwerdt, Sönke & Tschirner, Sebastian & Mauer, René, 2024. "Who gets the money? A qualitative analysis of fintech lending and credit scoring through the adoption of AI and alternative data," Technological Forecasting and Social Change, Elsevier, vol. 205(C).
Nadia Ayed & Khemaies Bougatef, 2024. "Performance Assessment of Logistic Regression (LR), Artificial Neural Network (ANN), Fuzzy Inference System (FIS) and Adaptive Neuro-Fuzzy System (ANFIS) in Predicting Default Probability: The Case of," Computational Economics, Springer;Society for Computational Economics, vol. 64(3), pages 1803-1835, September.
Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
- Elena Ivona Dumitrescu & Sullivan Hué & Christophe Hurlin & Sessi Tokpavi, 2022. "Machine Learning for Credit Scoring: Improving Logistic Regression with Non Linear Decision Tree Effects," Post-Print hal-03331114, HAL.
Li, Zhe & Liang, Shuguang & Pan, Xianyou & Pang, Meng, 2024. "Credit risk prediction based on loan profit: Evidence from Chinese SMEs," Research in International Business and Finance, Elsevier, vol. 67(PA).
Chi, Guotai & Dong, Bingjie & Zhou, Ying & Jin, Peng, 2024. "Long-horizon predictions of credit default with inconsistent customers," Technological Forecasting and Social Change, Elsevier, vol. 198(C).
Chen, Dangxing & Ye, Jiahui & Ye, Weicheng, 2023. "Interpretable selective learning in credit risk," Research in International Business and Finance, Elsevier, vol. 65(C).
Sullivan Hué, 2022. "GAM(L)A: An econometric model for interpretable machine learning," French Stata Users' Group Meetings 2022 19, Stata Users Group.
De Bock, Koen W. & Coussement, Kristof & Caigny, Arno De & Słowiński, Roman & Baesens, Bart & Boute, Robert N. & Choi, Tsan-Ming & Delen, Dursun & Kraus, Mathias & Lessmann, Stefan & Maldonado, Sebast, 2024. "Explainable AI for Operational Research: A defining framework, methods, applications, and a research agenda," European Journal of Operational Research, Elsevier, vol. 317(2), pages 249-272.
Koen W. de Bock & Kristof Coussement & Arno De Caigny & Roman Slowiński & Bart Baesens & Robert N Boute & Tsan-Ming Choi & Dursun Delen & Mathias Kraus & Stefan Lessmann & Sebastián Maldonado & David , 2023. "Explainable AI for Operational Research: A Defining Framework, Methods, Applications, and a Research Agenda," Post-Print hal-04219546, HAL.
Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
- Elena Dumitrescu & Sullivan Hué & Christophe Hurlin & Sessi Tokpavi, 2021. "Machine Learning or Econometrics for Credit Scoring: Let's Get the Best of Both Worlds," Working Papers hal-02507499, HAL.
Chai, Nana & Abedin, Mohammad Zoynul & Yang, Lian & Shi, Baofeng, 2025. "Farmers' credit risk evaluation with an explainable hybrid ensemble approach: A closer look in microfinance," Pacific-Basin Finance Journal, Elsevier, vol. 89(C).
Yang, Fan & Abedin, Mohammad Zoynul & Hajek, Petr, 2024. "An explainable federated learning and blockchain-based secure credit modeling method," European Journal of Operational Research, Elsevier, vol. 317(2), pages 449-467.
Emmanuel Flachaire & Gilles Hacheme & Sullivan Hu'e & S'ebastien Laurent, 2022. "GAM(L)A: An econometric model for interpretable Machine Learning," Papers 2203.11691, arXiv.org.
Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
Li, Yibei & Wang, Ximei & Djehiche, Boualem & Hu, Xiaoming, 2020. "Credit scoring by incorporating dynamic networked information," European Journal of Operational Research, Elsevier, vol. 286(3), pages 1103-1112.
- Yibei Li & Ximei Wang & Boualem Djehiche & Xiaoming Hu, 2019. "Credit Scoring by Incorporating Dynamic Networked Information," Papers 1905.11795, arXiv.org, revised Oct 2019.
Tsukahara, Fábio Yasuhiro & Kimura, Herbert & Sobreiro, Vinicius Amorim & Zambrano, Juan Carlos Arismendi, 2016. "Validation of default probability models: A stress testing approach," International Review of Financial Analysis, Elsevier, vol. 47(C), pages 70-85.
Sigrist, Fabio & Leuenberger, Nicola, 2023. "Machine learning for corporate default risk: Multi-period prediction, frailty correlation, loan portfolios, and tail probabilities," European Journal of Operational Research, Elsevier, vol. 305(3), pages 1390-1406.
Matthias Bogaert & Lex Delaere, 2023. "Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art," Mathematics, MDPI, vol. 11(5), pages 1-28, February.
Kriebel, Johannes & Stitz, Lennart, 2022. "Credit default prediction from user-generated text in peer-to-peer lending using deep learning," European Journal of Operational Research, Elsevier, vol. 302(1), pages 309-323.

More about this item

Keywords

; ; ; ; ;

Statistics

Access and download statistics

Corrections

All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:ejores:v:322:y:2025:i:2:p:647-664. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/eor .

Please note that corrections may take a couple of weeks to filter through the various RePEc services.

IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.

Browse Econ Literature

More features

Inherently interpretable machine learning for credit scoring: Optimal classification tree with hyperplane splits

Author

Abstract

Suggested Citation

Download full text from publisher

References listed on IDEAS

Most related items

More about this item

Keywords

Statistics

Corrections

More services and features

MyIDEAS

Author registration

Rankings

RePEc Genealogy

RePEc Biblio

MPRA

New papers by email

EconAcademics

Plagiarism

About RePEc

RePEc home

Blog

Help/FAQ

RePEc team

Participating archives

Privacy statement

Help us

Corrections

Volunteers

Get papers listed

Open a RePEc archive

Get RePEc data