
Credit scoring prediction leveraging interpretable ensemble learning

Author

Listed:
  • Yang Liu
  • Fei Huang
  • Lili Ma
  • Qingguo Zeng
  • Jiale Shi

Abstract

Credit scoring models based on machine learning often need improvement in both accuracy and interpretability in practical applications. The original KCDWU is notably adaptive but ignores intra‐class and inter‐class distances during clustering, so it may misidentify the class features and cluster structure of the data, which degrades the clustering result. We therefore improve the automatic K‐means clustering based on the Calinski–Harabasz index, which yields better clustering results. We also examine five representative single classification models and six ensemble learning models for credit scoring prediction. Comparing them on multiple evaluation indicators, we empirically confirm the superior performance of the ensemble learning models and identify CatBoost as the best model. Empirical results show that the SHAP method works well with CatBoost and delivers both global and local interpretations of its predictions. This work provides financial institutions with a promising candidate for interpretable credit scoring models.
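
As a rough illustration of the clustering idea named in the abstract, the sketch below picks the number of clusters K for plain K-means by maximizing the Calinski–Harabasz index, i.e. the ratio of between-cluster to within-cluster dispersion, each scaled by its degrees of freedom. This is not the authors' improved KCDWU algorithm, whose details are not given here. The function names (`farthest_point_init`, `kmeans`, `calinski_harabasz`, `choose_k`), the deterministic farthest-point initialization, and the toy data are all assumptions made for this example.

```python
# Minimal sketch (standard library only): choose K for K-means by the
# Calinski-Harabasz (CH) index. All names and data here are hypothetical;
# this is NOT the paper's improved KCDWU method.

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def farthest_point_init(points, k):
    """Deterministic init (an assumption): start from the first point, then
    repeatedly add the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def calinski_harabasz(points, labels):
    """CH = [B / (k - 1)] / [W / (n - k)], with B the between-cluster and
    W the within-cluster sum of squared distances."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    n, k = len(points), len(clusters)
    if k < 2 or k >= n:
        return 0.0
    overall = mean(points)
    between = within = 0.0
    for members in clusters.values():
        center = mean(members)
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(p, center) for p in members)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))

def choose_k(points, candidates=range(2, 7)):
    """Run K-means for each candidate K and keep the K with the highest CH."""
    scores = {k: calinski_harabasz(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores

# Toy data (made up): three 3x3 grids of points around well-separated centers.
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx in (-1, 0, 1)
          for dy in (-1, 0, 1)]
best_k, scores = choose_k(points)
print(best_k, scores)
```

On this toy data the index peaks at K = 3, the number of groups used to build it. Real credit data are rarely this cleanly separated, so the CH curve is usually read alongside other validity measures rather than on its own.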

Suggested Citation

  • Yang Liu & Fei Huang & Lili Ma & Qingguo Zeng & Jiale Shi, 2024. "Credit scoring prediction leveraging interpretable ensemble learning," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 43(2), pages 286-308, March.
  • Handle: RePEc:wly:jforec:v:43:y:2024:i:2:p:286-308
    DOI: 10.1002/for.3033

    Download full text from publisher

    File URL: https://doi.org/10.1002/for.3033
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Babaei, Golnoosh & Giudici, Paolo & Raffinetti, Emanuela, 2023. "Explainable FinTech lending," Journal of Economics and Business, Elsevier, vol. 125.
    2. Chen, Yujia & Calabrese, Raffaella & Martin-Barragan, Belen, 2024. "Interpretable machine learning for imbalanced credit scoring datasets," European Journal of Operational Research, Elsevier, vol. 312(1), pages 357-372.
    3. Chen, Dangxing & Ye, Jiahui & Ye, Weicheng, 2023. "Interpretable selective learning in credit risk," Research in International Business and Finance, Elsevier, vol. 65(C).
    4. Gero Szepannek, 2022. "An Overview on the Landscape of R Packages for Open Source Scorecard Modelling," Risks, MDPI, vol. 10(3), pages 1-33, March.
    5. Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
    6. Liu, Guangqiang & Zeng, Qing & Lei, Juan, 2022. "Dynamic risks from climate policy uncertainty: A case study for the natural gas market," Resources Policy, Elsevier, vol. 79(C).
    7. Lang, Qiaoqi & Ma, Feng & Mirza, Nawazish & Umar, Muhammad, 2023. "The interaction of climate risk and bank liquidity: An emerging market perspective for transitions to low carbon energy," Technological Forecasting and Social Change, Elsevier, vol. 191(C).
    8. Dangxing Chen & Luyao Zhang, 2023. "Monotonicity for AI ethics and society: An empirical study of the monotonic neural additive model in criminology, education, health care, and finance," Papers 2301.07060, arXiv.org.
    9. Serrano-Cinca, Carlos & Gutiérrez-Nieto, Begoña & Bernate-Valbuena, Martha, 2019. "The use of accounting anomalies indicators to predict business failure," European Management Journal, Elsevier, vol. 37(3), pages 353-375.
    10. Sun, Weixin & Zhang, Xuantao & Li, Minghao & Wang, Yong, 2023. "Interpretable high-stakes decision support system for credit default forecasting," Technological Forecasting and Social Change, Elsevier, vol. 196(C).
    11. Wang, Zhongbao & Razzaq, Asif, 2022. "Natural resources, energy efficiency transition and sustainable development: Evidence from BRICS economies," Resources Policy, Elsevier, vol. 79(C).
    12. Al-Amin Abba Dabo & Amin Hosseinian-Far, 2023. "An Integrated Methodology for Enhancing Reverse Logistics Flows and Networks in Industry 5.0," Logistics, MDPI, vol. 7(4), pages 1-26, December.
    13. Rasa Kanapickiene & Renatas Spicas, 2019. "Credit Risk Assessment Model for Small and Micro-Enterprises: The Case of Lithuania," Risks, MDPI, vol. 7(2), pages 1-23, June.
    14. Zhang, Lifeng & Chao, Xiangrui & Qian, Qian & Jing, Fuying, 2022. "Credit evaluation solutions for social groups with poor services in financial inclusion: A technical forecasting method," Technological Forecasting and Social Change, Elsevier, vol. 183(C).
    15. Casado Yusta, Silvia & Núñez Letamendía, Laura & Pacheco Bonrostro, Joaquín Antonio, 2018. "Predicting Corporate Failure: The GRASP-LOGIT Model || Predicción de la quiebra empresarial: el modelo GRASP-LOGIT," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 26(1), pages 294-314, Diciembre.
    16. Silvia Facchinetti & Paolo Giudici & Silvia Angela Osmetti, 2020. "Cyber risk measurement with ordinal data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 29(1), pages 173-185, March.
    17. Maldonado, Sebastián & Pérez, Juan & Bravo, Cristián, 2017. "Cost-based feature selection for Support Vector Machines: An application in credit scoring," European Journal of Operational Research, Elsevier, vol. 261(2), pages 656-665.
    18. Wang, Xiang & Yin, Jian & Yang, Yao & Muda, Iskandar & Abduvaxitovna, Shamansurova Zilola & AlWadi, Belal Mahmoud & Castillo-Picon, Jorge & Abdul-Samad, Zulkiflee, 2023. "Relationship between the resource curse, Forest management and sustainable development and the importance of R&D Projects," Resources Policy, Elsevier, vol. 85(PA).
    19. Calabrese, Raffaella & Osmetti, Silvia Angela, 2019. "A new approach to measure systemic risk: A bivariate copula model for dependent censored data," European Journal of Operational Research, Elsevier, vol. 279(3), pages 1053-1064.
    20. Li, Shanshan & Long, Fang & Long, Litao, 2022. "Resources curse and sustainable development revisited: Evaluating the role of remittances for China," Resources Policy, Elsevier, vol. 79(C).
