Printed from https://ideas.repec.org/p/arx/papers/2108.13914.html

Look Who's Talking: Interpretable Machine Learning for Assessing Italian SMEs Credit Default

Author

Listed:
  • Lisa Crosato
  • Caterina Liberati
  • Marco Repetto

Abstract

Academic research and the financial industry have recently paid great attention to Machine Learning algorithms because of their power to solve complex learning tasks. In the field of firms' default prediction, however, the lack of interpretability has prevented the extensive adoption of black-box models. To overcome this drawback while maintaining the high performance of black boxes, this paper relies on a model-agnostic approach. Accumulated Local Effects and Shapley values are used to shape the predictors' impact on the likelihood of default and to rank them according to their contribution to the model outcome. Prediction is achieved by two Machine Learning algorithms (eXtreme Gradient Boosting and FeedForward Neural Network), which are compared with three standard discriminant models. Results show that, in our analysis of the Italian Small and Medium Enterprises manufacturing industry, the eXtreme Gradient Boosting algorithm achieves the highest overall classification power without giving up a rich interpretation framework.
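
As an illustration of the Shapley-value ranking mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It computes exact Shapley values for a single observation by enumerating all feature coalitions, which is only feasible for a handful of predictors. The scoring function `toy_default_model`, its three predictors (leverage, liquidity, profitability), its coefficients, and the background rows are all hypothetical stand-ins for a fitted black-box classifier and its training data. Accumulated Local Effects are not covered here.

```python
from itertools import combinations
from math import exp, factorial


def toy_default_model(x):
    """Hypothetical black box: maps (leverage, liquidity, profitability)
    to a default probability. The coefficients are made up for illustration."""
    leverage, liquidity, profitability = x
    score = (2.0 * leverage - 1.0 * liquidity - 0.5 * profitability
             + 1.5 * leverage * liquidity)
    return 1.0 / (1.0 + exp(-score))


def shapley_values(model, x, background):
    """Exact Shapley values of the prediction model(x).

    Features outside a coalition are replaced by the values found in each
    background row, and the resulting predictions are averaged.
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = tuple(x[j] if j in coalition else row[j] for j in range(n))
            total += model(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi


def rank_by_contribution(names, phi):
    """Order predictors by the absolute size of their Shapley value."""
    return sorted(zip(names, phi), key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical background sample and firm to explain
background = [(0.2, 0.5, 0.1), (0.8, 0.3, -0.2), (0.5, 0.9, 0.4)]
firm = (0.9, 0.1, -0.3)
names = ["leverage", "liquidity", "profitability"]

phi = shapley_values(toy_default_model, firm, background)
for name, contribution in rank_by_contribution(names, phi):
    print(f"{name}: {contribution:+.4f}")
```

By construction, the contributions sum to the difference between the firm's predicted probability and the average prediction over the background rows. Averaging the absolute contributions over many firms would give a global ranking of predictors; for models with many predictors, an approximation would be needed in place of full enumeration.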

Suggested Citation

  • Lisa Crosato & Caterina Liberati & Marco Repetto, 2021. "Look Who's Talking: Interpretable Machine Learning for Assessing Italian SMEs Credit Default," Papers 2108.13914, arXiv.org, revised Sep 2021.
  • Handle: RePEc:arx:papers:2108.13914

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2108.13914
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Francesco Ciampi & Alessandro Giannozzi & Giacomo Marzi & Edward I. Altman, 2021. "Rethinking SME default prediction: a systematic literature review and future perspectives," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(3), pages 2141-2188, March.
    2. Carmen Gallucci & Rosalia Santulli & Michele Modina & Vincenzo Formisano, 2023. "Financial ratios, corporate governance and bank-firm information: a Bayesian approach to predict SMEs’ default," Journal of Management & Governance, Springer;Accademia Italiana di Economia Aziendale (AIDEA), vol. 27(3), pages 873-892, September.
    3. Andreeva, Galina & Calabrese, Raffaella & Osmetti, Silvia Angela, 2016. "A comparative analysis of the UK and Italian small businesses using Generalised Extreme Value models," European Journal of Operational Research, Elsevier, vol. 249(2), pages 506-516.
    4. David Veganzones, 2022. "Corporate failure prediction using threshold‐based models," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 41(5), pages 956-979, August.
    5. Yu Zhao & Huaming Du & Qing Li & Fuzhen Zhuang & Ji Liu & Gang Kou, 2022. "A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data Perspective," Papers 2211.14997, arXiv.org, revised May 2023.
    6. Aneta Ptak-Chmielewska, 2021. "Bankruptcy prediction of small- and medium-sized enterprises in Poland based on the LDA and SVM methods," Statistics in Transition New Series, Polish Statistical Association, vol. 22(1), pages 179-195, March.
    7. Ben Jabeur, Sami & Serret, Vanessa, 2023. "Bankruptcy prediction using fuzzy convolutional neural networks," Research in International Business and Finance, Elsevier, vol. 64(C).
    8. Andrés Alonso & José Manuel Carbó, 2021. "Understanding the performance of machine learning models to predict credit default: a novel approach for supervisory evaluation," Working Papers 2105, Banco de España.
    9. Alonso-Robisco, Andrés & Carbó, José Manuel, 2022. "Can machine learning models save capital for banks? Evidence from a Spanish credit portfolio," International Review of Financial Analysis, Elsevier, vol. 84(C).
    10. Chen, Yujia & Calabrese, Raffaella & Martin-Barragan, Belen, 2024. "Interpretable machine learning for imbalanced credit scoring datasets," European Journal of Operational Research, Elsevier, vol. 312(1), pages 357-372.
    11. Aneta Ptak-Chmielewska, 2019. "Predicting Micro-Enterprise Failures Using Data Mining Techniques," JRFM, MDPI, vol. 12(1), pages 1-17, February.
    12. Ptak-Chmielewska Aneta, 2021. "Bankruptcy prediction of small- and medium-sized enterprises in Poland based on the LDA and SVM methods," Statistics in Transition New Series, Polish Statistical Association, vol. 22(1), pages 179-195, March.
    13. Jabeur, Sami Ben & Gharib, Cheima & Mefteh-Wali, Salma & Arfi, Wissal Ben, 2021. "CatBoost model and artificial intelligence techniques for corporate failure prediction," Technological Forecasting and Social Change, Elsevier, vol. 166(C).
    14. Modina, Michele & Pietrovito, Filomena & Gallucci, Carmen & Formisano, Vincenzo, 2023. "Predicting SMEs’ default risk: Evidence from bank-firm relationship data," The Quarterly Review of Economics and Finance, Elsevier, vol. 89(C), pages 254-268.
    15. Sigrist, Fabio & Leuenberger, Nicola, 2023. "Machine learning for corporate default risk: Multi-period prediction, frailty correlation, loan portfolios, and tail probabilities," European Journal of Operational Research, Elsevier, vol. 305(3), pages 1390-1406.
    16. Tingqiang Chen & Suyang Wang, 2023. "Incomplete information model of credit default of micro and small enterprises," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 28(3), pages 2956-2974, July.
    17. Raffaella Calabrese & Galina Andreeva & Jake Ansell, 2019. "“Birds of a Feather” Fail Together: Exploring the Nature of Dependency in SME Defaults," Risk Analysis, John Wiley & Sons, vol. 39(1), pages 71-84, January.
    18. Andrés Alonso Robisco & José Manuel Carbó Martínez, 2022. "Measuring the model risk-adjusted performance of machine learning algorithms in credit default prediction," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-35, December.
    19. Bátiz-Zuk Enrique & Mohamed Abdulkadir & Sánchez-Cajal Fátima, 2021. "Exploring the sources of loan default clustering using survival analysis with frailty," Working Papers 2021-14, Banco de México.
    20. Calabrese, Raffaella, 2023. "Contagion effects of UK small business failures: A spatial hierarchical autoregressive model for binary data," European Journal of Operational Research, Elsevier, vol. 305(2), pages 989-997.
