Printed from https://ideas.repec.org/a/gam/jmathe/v12y2024i21p3423-d1511857.html

Ensemble-Based Machine Learning Algorithm for Loan Default Risk Prediction

Authors
  • Abisola Akinjole

    (School of Computing and Digital Technologies, Sheffield Hallam University, Sheffield S1 2NU, UK)

  • Olamilekan Shobayo

    (School of Computing and Digital Technologies, Sheffield Hallam University, Sheffield S1 2NU, UK)

  • Jumoke Popoola

    (School of Computing and Digital Technologies, Sheffield Hallam University, Sheffield S1 2NU, UK)

  • Obinna Okoyeigbo

    (Department of Engineering, Edge Hill University, Ormskirk L39 4QP, UK)

  • Bayode Ogunleye

    (Department of Computing & Mathematics, University of Brighton, Brighton BN2 4GJ, UK)

Abstract

Predicting credit default risk is important to financial institutions: accurately estimating the likelihood that a borrower will default on a loan helps to reduce financial losses, thereby maintaining profitability and stability. Although machine learning models have been used in assessing large applications with complex attributes for these predictions, there is still a need to identify the most effective techniques for the model development process, including techniques for addressing data imbalance. In this research, we conducted a comparative analysis of random forest, decision tree, Support Vector Machines (SVMs), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost) and the multilayer perceptron for predicting credit defaults using loan data from LendingClub. XGBoost was also used as a framework for testing and evaluating various techniques. We applied this XGBoost framework to the observed class imbalance by testing resampling methods such as Random Over-Sampling (ROS), the Synthetic Minority Over-Sampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), Random Under-Sampling (RUS), and hybrid approaches such as SMOTE with Tomek Links and SMOTE with Edited Nearest Neighbours (SMOTE + ENNs). The results showed that the balanced datasets significantly outperformed the imbalanced dataset, with SMOTE + ENNs delivering the best overall performance: an accuracy of 90.49%, a precision of 94.61% and a recall of 92.02%. Ensemble methods such as voting and stacking were then employed to enhance performance further. Our proposed model achieved an accuracy of 93.7%, a precision of 95.6% and a recall of 95.5%. This shows the potential of ensemble methods to improve credit default predictions and to provide lending platforms with a tool for reducing default rates and financial losses. In conclusion, the findings from this study have broader implications for financial institutions, offering a robust approach to risk assessment beyond the LendingClub dataset.
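
The abstract reports accuracy, precision and recall for a voting ensemble. As a rough illustration of what those quantities mean, the following minimal Python sketch combines the label predictions of three hypothetical classifiers by majority (hard) voting and scores the result. Everything in it is an assumption made for illustration: the toy labels, the names `model_a`, `model_b` and `model_c`, the choice of `1` as the positive class, and the helper names `hard_vote` and `binary_metrics`. It is not the authors' pipeline and does not use the LendingClub data.

```python
from collections import Counter
from typing import Dict, List, Sequence


def hard_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Majority vote: each inner sequence holds one model's predicted labels."""
    voted = []
    for labels in zip(*predictions):
        # Pick the label predicted by the most models for this sample.
        # With an odd number of models and two classes there are no ties.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision and recall computed from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels (hypothetical): 1 is treated as the positive class.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
model_a = [1, 1, 0, 0, 1, 1, 0, 1]
model_b = [1, 0, 1, 0, 0, 1, 1, 1]
model_c = [1, 1, 1, 1, 0, 0, 0, 1]

ensemble = hard_vote([model_a, model_b, model_c])
print("model_a :", binary_metrics(y_true, model_a))
print("ensemble:", binary_metrics(y_true, ensemble))
```

In this toy case each individual model makes two mistakes, but the mistakes fall on different samples, so the majority vote recovers every label. Real classifiers tend to make correlated errors, so gains from voting are usually smaller than this contrived example suggests.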

Suggested Citation

  • Abisola Akinjole & Olamilekan Shobayo & Jumoke Popoola & Obinna Okoyeigbo & Bayode Ogunleye, 2024. "Ensemble-Based Machine Learning Algorithm for Loan Default Risk Prediction," Mathematics, MDPI, vol. 12(21), pages 1-32, October.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:21:p:3423-:d:1511857

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/21/3423/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/21/3423/
    Download Restriction: no

    Citations

    Cited by:

    1. Shijie Wang & Xueyong Zhang, 2025. "Credit Rating Model Based on Improved TabNet," Mathematics, MDPI, vol. 13(9), pages 1-30, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matthew Rognlie & Andrei Shleifer & Alp Simsek, 2018. "Investment Hangover and the Great Recession," American Economic Journal: Macroeconomics, American Economic Association, vol. 10(2), pages 113-153, April.
    2. Antoniades, Adonis, 2015. "Commercial bank failures during the Great Recession: the real (estate) story," Working Paper Series 1779, European Central Bank.
    3. Felipe Restrepo & Lina Cardona‐Sosa & Philip E. Strahan, 2019. "Funding Liquidity without Banks: Evidence from a Shock to the Cost of Very Short‐Term Debt," Journal of Finance, American Finance Association, vol. 74(6), pages 2875-2914, December.
    4. Andrei Shleifer & Robert Vishny, 2011. "Fire Sales in Finance and Macroeconomics," Journal of Economic Perspectives, American Economic Association, vol. 25(1), pages 29-48, Winter.
    5. Waters, James, 2014. "Introduction of innovations during the 2007-8 financial crisis: US companies compared with universities," MPRA Paper 59016, University Library of Munich, Germany.
    6. Iyer, Rajkamal & Da-Rocha-Lopes, Samuel & Peydró, José-Luis & Schoar, Antoinette, 2014. "Interbank Liquidity Crunch and the Firm Credit Crunch: Evidence from the 2007-2009 Crisis," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 27(1), pages 347-372.
    7. Eisenbach, Thomas M., 2017. "Rollover risk as market discipline: A two-sided inefficiency," Journal of Financial Economics, Elsevier, vol. 126(2), pages 252-269.
    8. Burkhard Raunig & Johann Scharler & Friedrich Sindermann, 2017. "Do Banks Lend Less in Uncertain Times?," Economica, London School of Economics and Political Science, vol. 84(336), pages 682-711, October.
    9. Lim, Jamus Jerome & Minne, Geoffrey, 2014. "Learning from financial crises," Policy Research Working Paper Series 6838, The World Bank.
    10. Fatih Tuluk, 2019. "Shadow Banking, Capital Requirements and Monetary Policy," Working Papers 2019.05, International Network for Economic Research - INFER.
    11. de Ridder, Maarten, 2016. "Investment in productivity and the long-run effect of financial crises on output," LSE Research Online Documents on Economics 86180, London School of Economics and Political Science, LSE Library.
    12. Choi, Dong Beom & Jeong, Seongjun, 2025. "CSR scores versus actual impacts: Banks’ main street lending during the great recession," Journal of Banking & Finance, Elsevier, vol. 172(C).
    13. Juan J. Cortina & Tatiana Didier & Sergio L. Schmukler, 2018. "Corporate debt maturity in developing countries: Sources of long and short‐termism," The World Economy, Wiley Blackwell, vol. 41(12), pages 3288-3316, December.
    14. Lkhagvadorj Munkhdalai & Tsendsuren Munkhdalai & Oyun-Erdene Namsrai & Jong Yun Lee & Keun Ho Ryu, 2019. "An Empirical Comparison of Machine-Learning Methods on Bank Client Credit Assessments," Sustainability, MDPI, vol. 11(3), pages 1-23, January.
    15. Intan Suryani Abu Bakar & Arifur Khan & Paul Mather & George Tanewski, 2020. "Board monitoring and covenant restrictiveness in private debt contracts during the global financial crisis," Accounting and Finance, Accounting and Finance Association of Australia and New Zealand, vol. 60(S1), pages 661-692, April.
    16. Düwel, Cornelia, 2013. "Repo funding and internal capital markets in the financial crisis," Discussion Papers 16/2013, Deutsche Bundesbank.
    17. Caroline Flammer & Ioannis Ioannou, 2021. "Strategic management during the financial crisis: How firms adjust their strategic investments in response to credit market disruptions," Strategic Management Journal, Wiley Blackwell, vol. 42(7), pages 1275-1298, July.
    18. David Aikman & Jonathan Bridges & Anil Kashyap & Caspar Siegert, 2019. "Would Macroprudential Regulation Have Prevented the Last Crisis?," Journal of Economic Perspectives, American Economic Association, vol. 33(1), pages 107-130, Winter.
    19. Nadia Ayed & Khemaies Bougatef, 2024. "Performance Assessment of Logistic Regression (LR), Artificial Neural Network (ANN), Fuzzy Inference System (FIS) and Adaptive Neuro-Fuzzy System (ANFIS) in Predicting Default Probability: The Case of," Computational Economics, Springer;Society for Computational Economics, vol. 64(3), pages 1803-1835, September.
    20. Xavier Vives, 2014. "Strategic Complementarity, Fragility, and Regulation," The Review of Financial Studies, Society for Financial Studies, vol. 27(12), pages 3547-3592.
