
The value of text for small business default prediction: A Deep Learning approach

Author

Listed:
  • Stevenson, Matthew
  • Mues, Christophe
  • Bravo, Cristián

Abstract

Compared to consumer lending, Micro, Small and Medium Enterprise (mSME) credit risk modelling is particularly challenging, as, often, the same sources of information are not available. Therefore, it is standard policy for a loan officer to provide a textual loan assessment to mitigate limited data availability. In turn, this statement is analysed by a credit expert alongside any available standard credit data. In our paper, we exploit recent advances from the field of Deep Learning and Natural Language Processing (NLP), including the BERT (Bidirectional Encoder Representations from Transformers) model, to extract information from 60,000 textual assessments provided by a lender. We consider the performance in terms of the AUC (Area Under the receiver operating characteristic Curve) and Brier Score metrics and find that the text alone is surprisingly effective for predicting default. However, when combined with traditional data, it yields no additional predictive capability, with performance dependent on the text’s length. Our proposed Deep Learning model does, however, appear to be robust to the quality of the text and therefore suitable for partly automating the mSME lending process. We also demonstrate how the content of loan assessments influences performance, leading us to a series of recommendations on a new strategy for collecting future mSME loan assessments.
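The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy labels and scores are hypothetical, and the AUC is computed in its pairwise form (the share of defaulter/non-defaulter pairs in which the defaulter receives the higher score, with ties counted as half), which is equivalent to the area under the ROC curve.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted default probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Share of (defaulter, non-defaulter) pairs ranked correctly; ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): 1 = default, 0 = repaid.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print("AUC:", auc(labels, scores))            # 3 of 4 pairs ranked correctly -> 0.75
print("Brier:", brier_score(labels, scores))
```

AUC measures only how well the scores rank defaulters above non-defaulters, while the Brier Score also penalises poorly calibrated probabilities, which is one reason the two are commonly reported together.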

Suggested Citation

  • Stevenson, Matthew & Mues, Christophe & Bravo, Cristián, 2021. "The value of text for small business default prediction: A Deep Learning approach," European Journal of Operational Research, Elsevier, vol. 295(2), pages 758-771.
  • Handle: RePEc:eee:ejores:v:295:y:2021:i:2:p:758-771
    DOI: 10.1016/j.ejor.2021.03.008

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221721001983
    Download Restriction: Full text for ScienceDirect subscribers only

    Citations

    Cited by:

    1. Koen W. de Bock & Kristof Coussement & Arno De Caigny & Roman Slowiński & Bart Baesens & Robert N Boute & Tsan-Ming Choi & Dursun Delen & Mathias Kraus & Stefan Lessmann & Sebastián Maldonado & David , 2023. "Explainable AI for Operational Research: A Defining Framework, Methods, Applications, and a Research Agenda," Post-Print hal-04219546, HAL.
    2. Kriebel, Johannes & Stitz, Lennart, 2022. "Credit default prediction from user-generated text in peer-to-peer lending using deep learning," European Journal of Operational Research, Elsevier, vol. 302(1), pages 309-323.
    3. Wei Li & Florentina Paraschiv & Georgios Sermpinis, 2021. "A Data-driven Explainable Case-based Reasoning Approach for Financial Risk Detection," Papers 2107.08808, arXiv.org.
    4. Christopher Gerling & Stefan Lessmann, 2023. "Multimodal Document Analytics for Banking Process Automation," Papers 2307.11845, arXiv.org, revised Nov 2023.
    5. Lisa Crosato & Caterina Liberati & Marco Repetto, 2021. "Look Who's Talking: Interpretable Machine Learning for Assessing Italian SMEs Credit Default," Papers 2108.13914, arXiv.org, revised Sep 2021.
    6. Korangi, Kamesh & Mues, Christophe & Bravo, Cristián, 2023. "A transformer-based model for default prediction in mid-cap corporate markets," European Journal of Operational Research, Elsevier, vol. 308(1), pages 306-320.
    7. Goodell, John W. & Ben Jabeur, Sami & Saâdaoui, Foued & Nasir, Muhammad Ali, 2023. "Explainable artificial intelligence modeling to forecast bitcoin prices," International Review of Financial Analysis, Elsevier, vol. 88(C).
    8. Kamesh Korangi & Christophe Mues & Cristián Bravo, 2021. "A transformer-based model for default prediction in mid-cap corporate markets," Papers 2111.09902, arXiv.org, revised Apr 2023.
    9. Mahsa Tavakoli & Rohitash Chandra & Fengrui Tian & Cristián Bravo, 2023. "Multi-Modal Deep Learning for Credit Rating Prediction Using Text and Numerical Data Streams," Papers 2304.10740, arXiv.org, revised Sep 2023.
    10. Katsafados, Apostolos G. & Leledakis, George N. & Pyrgiotakis, Emmanouil G. & Androutsopoulos, Ion & Fergadiotis, Manos, 2024. "Machine learning in bank merger prediction: A text-based approach," European Journal of Operational Research, Elsevier, vol. 312(2), pages 783-797.
    11. Vairetti, Carla & Aránguiz, Ignacio & Maldonado, Sebastián & Karmy, Juan Pablo & Leal, Alonso, 2024. "Analytics-driven complaint prioritisation via deep learning and multicriteria decision-making," European Journal of Operational Research, Elsevier, vol. 312(3), pages 1108-1118.
    12. Mario Sanz-Guerrero & Javier Arroyo, 2024. "Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending," Papers 2401.16458, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kriebel, Johannes & Stitz, Lennart, 2022. "Credit default prediction from user-generated text in peer-to-peer lending using deep learning," European Journal of Operational Research, Elsevier, vol. 302(1), pages 309-323.
    2. Jiang, Cuiqing & Lyu, Ximei & Yuan, Yufei & Wang, Zhao & Ding, Yong, 2022. "Mining semantic features in current reports for financial distress prediction: Empirical evidence from unlisted public firms in China," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1086-1099.
    3. Yufei Xia & Lingyun He & Yinguo Li & Nana Liu & Yanlin Ding, 2020. "Predicting loan default in peer‐to‐peer lending using narrative data," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 39(2), pages 260-280, March.
    4. Medina-Olivares, Victor & Calabrese, Raffaella & Dong, Yizhe & Shi, Baofeng, 2022. "Spatial dependence in microfinance credit default," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1071-1085.
    5. Lisa Crosato & Caterina Liberati & Marco Repetto, 2021. "Look Who's Talking: Interpretable Machine Learning for Assessing Italian SMEs Credit Default," Papers 2108.13914, arXiv.org, revised Sep 2021.
    6. Wang, Chao & Wang, Junbo & Wu, Chunchi & Zhang, Yue, 2023. "Voluntary disclosure in P2P lending: Information or hyperbole?," Pacific-Basin Finance Journal, Elsevier, vol. 79(C).
    7. Wang, Chao & Zhang, Yue & Zhang, Weiguo & Gong, Xue, 2021. "Textual sentiment of comments and collapse of P2P platforms: Evidence from China's P2P market," Research in International Business and Finance, Elsevier, vol. 58(C).
    8. Christopher Gerling & Stefan Lessmann, 2023. "Multimodal Document Analytics for Banking Process Automation," Papers 2307.11845, arXiv.org, revised Nov 2023.
    9. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    10. Wu, Yu & Zhang, Tong, 2021. "Can credit ratings predict defaults in peer-to-peer online lending? Evidence from a Chinese platform," Finance Research Letters, Elsevier, vol. 40(C).
    11. Sarbjit Singh Oberoi & Sayan Banerjee, 2023. "Bankruptcy Prediction of Indian Banks Using Advanced Analytics," Economic Studies journal, Bulgarian Academy of Sciences - Economic Research Institute, issue 4, pages 22-41.
    12. Li, Zhiyong & Li, Aimin & Bellotti, Anthony & Yao, Xiao, 2023. "The profitability of online loans: A competing risks analysis on default and prepayment," European Journal of Operational Research, Elsevier, vol. 306(2), pages 968-985.
    13. Gregor Dorfleitner & Eva-Maria Oswald & Rongxin Zhang, 2021. "From Credit Risk to Social Impact: On the Funding Determinants in Interest-Free Peer-to-Peer Lending," Journal of Business Ethics, Springer, vol. 170(2), pages 375-400, May.
    14. Luisa Roa & Alejandro Correa-Bahnsen & Gabriel Suarez & Fernando Cortés-Tejada & María A. Luque & Cristián Bravo, 2020. "Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications," Papers 2005.14658, arXiv.org, revised Jan 2021.
    15. Koen W. de Bock, 2017. "The best of two worlds: Balancing model strength and comprehensibility in business failure prediction using spline-rule ensembles," Post-Print hal-01588059, HAL.
    16. Borchert, Philipp & Coussement, Kristof & De Caigny, Arno & De Weerdt, Jochen, 2023. "Extending business failure prediction models with textual website content using deep learning," European Journal of Operational Research, Elsevier, vol. 306(1), pages 348-357.
    17. Wang, Tong & Zhao, Sheng & Zhou, Mengqiu, 2022. "Does soft information in expert ratings curb information asymmetry? Evidence from crowdfunding and early transaction phases of Initial Coin offerings," Journal of International Financial Markets, Institutions and Money, Elsevier, vol. 81(C).
    18. Gradojevic, Nikola & Kukolj, Dragan & Adcock, Robert & Djakovic, Vladimir, 2023. "Forecasting Bitcoin with technical analysis: A not-so-random forest?," International Journal of Forecasting, Elsevier, vol. 39(1), pages 1-17.
    19. Koen W. de Bock & Kristof Coussement & Arno De Caigny & Roman Slowiński & Bart Baesens & Robert N Boute & Tsan-Ming Choi & Dursun Delen & Mathias Kraus & Stefan Lessmann & Sebastián Maldonado & David , 2023. "Explainable AI for Operational Research: A Defining Framework, Methods, Applications, and a Research Agenda," Post-Print hal-04219546, HAL.
    20. Hewa-Wellalage, Nirosha & Boubaker, Sabri & Hunjra, Ahmed Imran & Verhoeven, Peter, 2022. "The gender gap in access to finance: Evidence from the COVID-19 pandemic," Finance Research Letters, Elsevier, vol. 46(PA).
