Printed from https://ideas.repec.org/p/arx/papers/2104.06735.html

Enabling Machine Learning Algorithms for Credit Scoring -- Explainable Artificial Intelligence (XAI) methods for clear understanding complex predictive models

Author

Listed:
  • Przemysław Biecek
  • Marcin Chlebus
  • Janusz Gajda
  • Alicja Gosiewska
  • Anna Kozak
  • Dominik Ogonowski
  • Jakub Sztachelski
  • Piotr Wojewnik

Abstract

Rapid development of advanced modelling techniques makes it possible to build increasingly accurate tools. However, as usual, everything comes at a price: in this case, a model loses interpretability as it gains accuracy and precision. For managers who must control and effectively manage credit risk, and for regulators who must be convinced of model quality, that price is too high. In this paper, we show how to take credit scoring analytics to the next level. We compare various predictive models (logistic regression, logistic regression with weight of evidence transformations, and modern artificial intelligence algorithms) and show that advanced tree-based models give the best results in predicting client default. More importantly, we also show how to augment advanced models with techniques that make them interpretable and more accessible to credit risk practitioners, removing a crucial obstacle to the widespread deployment of more complex 'black box' models such as random forests, gradient boosted trees or extreme gradient boosted trees. All of this is demonstrated on a large dataset obtained from the Polish Credit Bureau, to which all the banks and most of the lending companies in the country report their credit files. The data used in this paper come from lending companies. The paper thus compares state-of-the-art best practices in credit risk modelling with new advanced statistical tools, supported by the latest developments in the interpretability and explainability of artificial intelligence algorithms. We believe this is a valuable contribution in presenting different modelling tools and, more importantly, in showing which methods can be used to gain insight into and understanding of AI methods in the credit risk context.

Suggested Citation

  • Przemysław Biecek & Marcin Chlebus & Janusz Gajda & Alicja Gosiewska & Anna Kozak & Dominik Ogonowski & Jakub Sztachelski & Piotr Wojewnik, 2021. "Enabling Machine Learning Algorithms for Credit Scoring -- Explainable Artificial Intelligence (XAI) methods for clear understanding complex predictive models," Papers 2104.06735, arXiv.org.
  • Handle: RePEc:arx:papers:2104.06735

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2104.06735
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Malhotra, Rashmi & Malhotra, D. K., 2002. "Differentiating between good credits and bad credits using neuro-fuzzy systems," European Journal of Operational Research, Elsevier, vol. 136(1), pages 190-211, January.
    2. John Y. Campbell & Jens Hilscher & Jan Szilagyi, 2008. "In Search of Distress Risk," Journal of Finance, American Finance Association, vol. 63(6), pages 2899-2939, December.
    3. Martens, David & Baesens, Bart & Van Gestel, Tony & Vanthienen, Jan, 2007. "Comprehensible credit scoring models using rule extraction from support vector machines," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1466-1476, December.
    4. D. J. Hand & W. E. Henley, 1997. "Statistical Classification Methods in Consumer Credit Scoring: a Review," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 160(3), pages 523-541, September.
    5. David Durand, 1941. "Risk Elements in Consumer Instalment Financing," NBER Books, National Bureau of Economic Research, Inc, number dura41-1, July.
    6. Martin Leo & Suneel Sharma & K. Maddulety, 2019. "Machine Learning in Banking Risk Management: A Literature Review," Risks, MDPI, vol. 7(1), pages 1-22, March.
    7. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    8. David Durand, 1941. "Risk Elements in Consumer Instalment Financing, Technical Edition," NBER Books, National Bureau of Economic Research, Inc, number dura41-2, July.
    9. Frydman, Halina & Altman, Edward I & Kao, Duen-Li, 1985. "Introducing Recursive Partitioning for Financial Classification: The Case of Financial Distress," Journal of Finance, American Finance Association, vol. 40(1), pages 269-291, March.

    Citations


    Cited by:

    1. Yu Zhao & Huaming Du & Qing Li & Fuzhen Zhuang & Ji Liu & Gang Kou, 2022. "A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data Perspective," Papers 2211.14997, arXiv.org, revised May 2023.
    2. Emer Owens & Barry Sheehan & Martin Mullins & Martin Cunneen & Juliane Ressel & German Castignani, 2022. "Explainable Artificial Intelligence (XAI) in Insurance," Risks, MDPI, vol. 10(12), pages 1-50, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Juan Laborda & Seyong Ryoo, 2021. "Feature Selection in a Credit Scoring Model," Mathematics, MDPI, vol. 9(7), pages 1-22, March.
    2. Büşra Alma Çallı & Erman Coşkun, 2021. "A Longitudinal Systematic Review of Credit Risk Assessment and Credit Default Predictors," SAGE Open, vol. 11(4), pages 21582440211, November.
    3. Dimitrios Nikolaidis & Michalis Doumpos, 2022. "Credit Scoring with Drift Adaptation Using Local Regions of Competence," SN Operations Research Forum, Springer, vol. 3(4), pages 1-28, December.
    4. Maria Rocha Sousa & João Gama & Elísio Brandão, 2013. "Introducing time-changing economics into credit scoring," FEP Working Papers 513, Universidade do Porto, Faculdade de Economia do Porto.
    5. Thomas Wainwright, 2011. "Elite Knowledges: Framing Risk and the Geographies of Credit," Environment and Planning A, vol. 43(3), pages 650-665, March.
    6. Gunnarsson, Björn Rafn & vanden Broucke, Seppe & Baesens, Bart & Óskarsdóttir, María & Lemahieu, Wilfried, 2021. "Deep learning for credit scoring: Do or don’t?," European Journal of Operational Research, Elsevier, vol. 295(1), pages 292-305.
    7. Doruk Şen & Cem Çağrı Dönmez & Umman Mahir Yıldırım, 2020. "A Hybrid Bi-level Metaheuristic for Credit Scoring," Information Systems Frontiers, Springer, vol. 22(5), pages 1009-1019, October.
    8. Fernandes, Guilherme Barreto & Artes, Rinaldo, 2016. "Spatial dependence in credit risk and its improvement in credit scoring," European Journal of Operational Research, Elsevier, vol. 249(2), pages 517-524.
    9. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    10. Crook, Jonathan N. & Edelman, David B. & Thomas, Lyn C., 2007. "Recent developments in consumer credit risk assessment," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1447-1465, December.
    11. Huei-Wen Teng & Michael Lee, 2019. "Estimation Procedures of Using Five Alternative Machine Learning Methods for Predicting Credit Card Default," Review of Pacific Basin Financial Markets and Policies (RPBFMP), World Scientific Publishing Co. Pte. Ltd., vol. 22(03), pages 1-27, September.
    12. Akkoç, Soner, 2012. "An empirical comparison of conventional techniques, neural networks and the three stage hybrid Adaptive Neuro Fuzzy Inference System (ANFIS) model for credit scoring analysis: The case of Turkish cred," European Journal of Operational Research, Elsevier, vol. 222(1), pages 168-178.
    13. Neuberg Richard & Hannah Lauren, 2017. "Loan pricing under estimation risk," Statistics & Risk Modeling, De Gruyter, vol. 34(1-2), pages 69-87, June.
    14. TOBBACK, Ellen & MARTENS, David, 2017. "Retail credit scoring using fine-grained payment data," Working Papers 2017011, University of Antwerp, Faculty of Business and Economics.
    15. Doruk Şen & Cem Çağrı Dönmez & Umman Mahir Yıldırım, 0. "A Hybrid Bi-level Metaheuristic for Credit Scoring," Information Systems Frontiers, Springer, vol. 0, pages 1-11.
    16. Fernandes, Guilherme Barreto & Artes, Rinaldo, 2013. "Spatial correlation in credit risk and its improvement in credit scoring," Insper Working Papers wpe_321, Insper Working Paper, Insper Instituto de Ensino e Pesquisa.
    17. Anjali Chopra & Priyanka Bhilare, 2018. "Application of Ensemble Models in Credit Scoring Models," Business Perspectives and Research, vol. 6(2), pages 129-141, July.
    18. Fabián Enrique Salazar Villano, 2013. "Cuantificación del riesgo de incumplimiento en créditos de libre inversión: un ejercicio econométrico para una entidad bancaria del municipio de Popayán, Colombia," Estudios Gerenciales, Universidad Icesi, December.
    19. Puertas Medina, Rosa & Selva, Maria Luisa Martí, 2013. "Análise do credit scoring," RAE - Revista de Administração de Empresas, FGV-EAESP Escola de Administração de Empresas de São Paulo (Brazil), vol. 53(3), May.
    20. Finlay, Steven, 2011. "Multiple classifier architectures and their application to credit risk assessment," European Journal of Operational Research, Elsevier, vol. 210(2), pages 368-378, April.
