
Ensemble-Based Machine Learning Algorithm for Loan Default Risk Prediction

Author

Listed:
  • Abisola Akinjole

    (School of Computing and Digital Technologies, Sheffield Hallam University, Sheffield S1 2NU, UK)

  • Olamilekan Shobayo

    (School of Computing and Digital Technologies, Sheffield Hallam University, Sheffield S1 2NU, UK)

  • Jumoke Popoola

    (School of Computing and Digital Technologies, Sheffield Hallam University, Sheffield S1 2NU, UK)

  • Obinna Okoyeigbo

    (Department of Engineering, Edge Hill University, Ormskirk L39 4QP, UK)

  • Bayode Ogunleye

    (Department of Computing & Mathematics, University of Brighton, Brighton BN2 4GJ, UK)

Abstract

Predicting credit default risk is important to financial institutions, as accurately predicting the likelihood of a borrower defaulting on their loans will help to reduce financial losses, thereby maintaining profitability and stability. Although machine learning models have been used in assessing large applications with complex attributes for these predictions, there is still a need to identify the most effective techniques for the model development process, including the technique to address the issue of data imbalance. In this research, we conducted a comparative analysis of random forest, decision tree, SVMs (Support Vector Machines), XGBoost (Extreme Gradient Boosting), ADABoost (Adaptive Boosting) and the multi-layered perceptron, to predict credit defaults using loan data from LendingClub. Additionally, XGBoost was used as a framework for testing and evaluating various techniques. Moreover, we applied this XGBoost framework to handle the issue of class imbalance observed, by testing various resampling methods such as Random Over-Sampling (ROS), the Synthetic Minority Over-Sampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), Random Under-Sampling (RUS), and hybrid approaches like the SMOTE with Tomek Links and the SMOTE with Edited Nearest Neighbours (SMOTE + ENNs). The results showed that balanced datasets significantly outperformed the imbalanced dataset, with the SMOTE + ENNs delivering the best overall performance, achieving an accuracy of 90.49%, a precision of 94.61% and a recall of 92.02%. Furthermore, ensemble methods such as voting and stacking were employed to enhance performance further. Our proposed model achieved an accuracy of 93.7%, a precision of 95.6% and a recall of 95.5%, which shows the potential of ensemble methods in improving credit default predictions and can provide lending platforms with the tool to reduce default rates and financial losses. 
In conclusion, the findings from this study have broader implications for financial institutions, offering a robust approach to risk assessment beyond the LendingClub dataset.
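
The class-imbalance point in the abstract can be illustrated with a small sketch. The code below is not the authors' pipeline: it uses only the Python standard library, a made-up ten-row toy dataset, and two hypothetical helper names (random_over_sample and binary_metrics). It shows Random Over-Sampling (ROS), the simplest of the resampling methods listed above, and why accuracy alone can mislead on imbalanced data. Treating "default" as the positive class (label 1) is an assumption made here for illustration.

```python
import random
from collections import Counter


def random_over_sample(rows, labels, seed=0):
    """Random Over-Sampling: duplicate rows of smaller classes, chosen at
    random with replacement, until every class is as large as the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for a binary task.
    Assumption: `positive` marks the class of interest (here, default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy data (invented): 8 repaid loans (0) and 2 defaults (1).
toy_rows = [[i] for i in range(10)]
toy_labels = [0] * 8 + [1] * 2

# A classifier that always predicts "no default" looks accurate
# but never finds a single defaulter.
always_negative = [0] * len(toy_labels)
baseline = binary_metrics(toy_labels, always_negative)
print(baseline)

# After ROS both classes have the same number of rows.
balanced_rows, balanced_labels = random_over_sample(toy_rows, toy_labels)
print(Counter(balanced_labels))
```

In practice, resampling of this kind is normally applied to the training split only, so that duplicated or synthetic rows do not leak into the evaluation data. SMOTE, ADASYN and the SMOTE hybrids named in the abstract generate or filter samples rather than simply duplicating them, and are not reproduced here.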

Suggested Citation

  • Abisola Akinjole & Olamilekan Shobayo & Jumoke Popoola & Obinna Okoyeigbo & Bayode Ogunleye, 2024. "Ensemble-Based Machine Learning Algorithm for Loan Default Risk Prediction," Mathematics, MDPI, vol. 12(21), pages 1-32, October.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:21:p:3423-:d:1511857

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/21/3423/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/21/3423/
    Download Restriction: no

    References listed on IDEAS

    1. Duffie, Darrell, 2011. "Measuring Corporate Default Risk," OUP Catalogue, Oxford University Press, number 9780199279234, December.
    2. Dimitris Rizopoulos, 2018. "Max Kuhn and Kjell Johnson. Applied Predictive Modeling. New York, Springer," Biometrics, The International Biometric Society, vol. 74(1), pages 383-383, March.
    3. Fahmida E. Moula & Chi Guotai & Mohammad Zoynul Abedin, 2017. "Credit default prediction modeling: an application of support vector machine," Risk Management, Palgrave Macmillan, vol. 19(2), pages 158-187, May.
    4. Thomas, Lyn C., 2000. "A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers," International Journal of Forecasting, Elsevier, vol. 16(2), pages 149-172.
    5. Markus K. Brunnermeier, 2009. "Deciphering the Liquidity and Credit Crunch 2007-2008," Journal of Economic Perspectives, American Economic Association, vol. 23(1), pages 77-100, Winter.
    6. Ivashina, Victoria & Scharfstein, David, 2010. "Bank lending during the financial crisis of 2008," Journal of Financial Economics, Elsevier, vol. 97(3), pages 319-338, September.
    7. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Efraim Benmelech & Ralf R. Meisenzahl & Rodney Ramcharan, 2017. "The Real Effects of Liquidity During the Financial Crisis: Evidence from Automobiles," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 132(1), pages 317-365.
    2. Matthew Rognlie & Andrei Shleifer & Alp Simsek, 2018. "Investment Hangover and the Great Recession," American Economic Journal: Macroeconomics, American Economic Association, vol. 10(2), pages 113-153, April.
    3. Ippolito, Filippo & Peydró, José-Luis & Polo, Andrea & Sette, Enrico, 2016. "Double bank runs and liquidity risk management," Journal of Financial Economics, Elsevier, vol. 122(1), pages 135-154.
    4. Antoniades, Adonis, 2015. "Commercial bank failures during the Great Recession: the real (estate) story," Working Paper Series 1779, European Central Bank.
    5. Markus Behn & Rainer Haselmann & Paul Wachtel, 2016. "Procyclical Capital Regulation and Lending," Journal of Finance, American Finance Association, vol. 71(2), pages 919-956, April.
    6. Jose M. Berrospide, 2013. "Bank liquidity hoarding and the financial crisis: an empirical evaluation," Finance and Economics Discussion Series 2013-03, Board of Governors of the Federal Reserve System (U.S.).
    7. Felipe Restrepo & Lina Cardona‐Sosa & Philip E. Strahan, 2019. "Funding Liquidity without Banks: Evidence from a Shock to the Cost of Very Short‐Term Debt," Journal of Finance, American Finance Association, vol. 74(6), pages 2875-2914, December.
    8. Andrei Shleifer & Robert Vishny, 2011. "Fire Sales in Finance and Macroeconomics," Journal of Economic Perspectives, American Economic Association, vol. 25(1), pages 29-48, Winter.
    9. Waters, James, 2014. "Introduction of innovations during the 2007-8 financial crisis: US companies compared with universities," MPRA Paper 59016, University Library of Munich, Germany.
    10. Iyer, Rajkamal & Da-Rocha-Lopes, Samuel & Peydró, José-Luis & Schoar, Antoinette, 2014. "Interbank Liquidity Crunch and the Firm Credit Crunch: Evidence from the 2007-2009 Crisis," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 27(1), pages 347-372.
    11. Hasan, Iftekhar & Politsidis, Panagiotis N. & Sharma, Zenu, 2021. "Global syndicated lending during the COVID-19 pandemic," Journal of Banking & Finance, Elsevier, vol. 133(C).
    12. Eisenbach, Thomas M., 2017. "Rollover risk as market discipline: A two-sided inefficiency," Journal of Financial Economics, Elsevier, vol. 126(2), pages 252-269.
    13. Burkhard Raunig & Johann Scharler & Friedrich Sindermann, 2017. "Do Banks Lend Less in Uncertain Times?," Economica, London School of Economics and Political Science, vol. 84(336), pages 682-711, October.
    14. Ritz, Robert A. & Walther, Ansgar, 2015. "How do banks respond to increased funding uncertainty?," Journal of Financial Intermediation, Elsevier, vol. 24(3), pages 386-410.
    15. Lim, Jamus Jerome & Minne, Geoffrey, 2014. "Learning from financial crises," Policy Research Working Paper Series 6838, The World Bank.
    16. Fatih Tuluk, 2019. "Shadow Banking, Capital Requirements and Monetary Policy," Working Papers 2019.05, International Network for Economic Research - INFER.
    17. de Ridder, Maarten, 2016. "Investment in productivity and the long-run effect of financial crises on output," LSE Research Online Documents on Economics 86180, London School of Economics and Political Science, LSE Library.
    18. Kristian Blickle & Markus Brunnermeier & Stephan Luck, 2024. "Who Can Tell Which Banks Will Fail?," The Review of Financial Studies, Society for Financial Studies, vol. 37(9), pages 2685-2731.
    19. Drobetz, Wolfgang & Haller, Rebekka & Meier, Iwan & Tarhan, Vefa, 2017. "The impact of liquidity crises on cash flow sensitivities," The Quarterly Review of Economics and Finance, Elsevier, vol. 66(C), pages 225-239.
    20. Abankwa, Samuel & Blenman, Lloyd P., 2021. "Measuring liquidity risk effects on carry trades across currencies and regimes," Journal of Multinational Financial Management, Elsevier, vol. 60(C).
