
An implementation of ensemble methods, logistic regression, and neural network for default prediction in Peer-to-Peer lending

Author

Listed:
  • Aneta Dzik-Walczak

    (University of Warsaw – Faculty of Economic Sciences, Długa 44/50, 00-241 Warsaw, Poland)

  • Mateusz Heba

    (University of Warsaw – Faculty of Economic Sciences, Długa 44/50, 00-241 Warsaw, Poland)

Abstract

Credit scoring has become an important issue because competition among financial institutions is intense and even a small improvement in predictive accuracy can yield significant savings. Financial institutions seek optimal strategies based on credit scoring models, so credit scoring tools are studied extensively. As a result, various parametric statistical methods, non-parametric statistical tools, and soft computing approaches have been developed to improve the accuracy of credit scoring models. In this paper, different approaches are used to classify customers into those who repay a loan and those who default on it. The purpose of this study is to investigate the performance of two credit scoring techniques: a logistic regression model estimated on categorized variables transformed with WOE (Weight of Evidence), and neural networks. We also combine multiple classifiers and test whether ensemble learning performs better. To evaluate the feasibility and effectiveness of these methods, the analysis is performed on Lending Club data. In addition, we investigate peer-to-peer lending, also called social lending. The results indicate that the logistic regression model can outperform neural networks. The proposed ensemble model (a combination of logistic regression and a neural network, obtained by averaging the probabilities from both models) achieves a higher AUC, Gini coefficient, and Kolmogorov-Smirnov statistic than the other models. We therefore conclude that the ensemble model can reduce the potential losses arising from misclassification costs.
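
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the WOE sign convention (log of the share of good loans over the share of bad loans in a bin), and the toy scores are all assumptions made for illustration. It shows a WOE value for one bin, an ensemble formed by averaging two models' default probabilities, and the AUC, Gini coefficient, and Kolmogorov-Smirnov statistic of a score.

```python
import math


def weight_of_evidence(goods, bads, total_goods, total_bads):
    """WOE of one bin, assuming the convention ln(%good / %bad)."""
    return math.log((goods / total_goods) / (bads / total_bads))


def average_probabilities(p_a, p_b):
    """Ensemble by simple averaging of two models' default probabilities."""
    return [(a + b) / 2 for a, b in zip(p_a, p_b)]


def _split(labels, scores):
    """Separate scores of defaulters (label 1) and repayers (label 0)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    return pos, neg


def auc(labels, scores):
    """Chance that a random defaulter outscores a random repayer (ties = 0.5)."""
    pos, neg = _split(labels, scores)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient derived from AUC: 2 * AUC - 1."""
    return 2 * auc(labels, scores) - 1


def ks_statistic(labels, scores):
    """Largest gap between the two classes' empirical score CDFs."""
    pos, neg = _split(labels, scores)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


if __name__ == "__main__":
    # Made-up labels (1 = default) and made-up model outputs, for illustration only.
    labels = [0, 0, 0, 1, 0, 1, 1, 1]
    p_logit = [0.10, 0.20, 0.30, 0.35, 0.60, 0.55, 0.80, 0.90]
    p_nn = [0.20, 0.10, 0.50, 0.40, 0.30, 0.70, 0.60, 0.80]
    p_ens = average_probabilities(p_logit, p_nn)
    for name, p in [("logit", p_logit), ("nn", p_nn), ("ensemble", p_ens)]:
        print(name, auc(labels, p), gini(labels, p), ks_statistic(labels, p))
```

Averaging is only one way to combine classifiers; the sketch uses it because the abstract names it. On real data, library routines would normally replace the hand-written metric functions.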

Suggested Citation

  • Aneta Dzik-Walczak & Mateusz Heba, 2021. "An implementation of ensemble methods, logistic regression, and neural network for default prediction in Peer-to-Peer lending," Zbornik radova Ekonomskog fakulteta u Rijeci/Proceedings of Rijeka Faculty of Economics, University of Rijeka, Faculty of Economics and Business, vol. 39(1), pages 163-197.
  • Handle: RePEc:rfe:zbefri:v:39:y:2021:i:1:p:163-197

    Download full text from publisher

    File URL: https://www.efri.uniri.hr/upload/Zbornik%201_2021/01-Dzik-Walczak_et_al-2021-1.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Fitzpatrick, Trevor & Mues, Christophe, 2016. "An empirical comparison of classification algorithms for mortgage default prediction: evidence from a distressed mortgage market," European Journal of Operational Research, Elsevier, vol. 249(2), pages 427-439.
    2. Evžen Kocenda & Martin Vojtek, 2011. "Default Predictors in Retail Credit Scoring: Evidence from Czech Banking Data," Emerging Markets Finance and Trade, Taylor & Francis Journals, vol. 47(6), pages 80-98, November.
    3. D. J. Hand & W. E. Henley, 1997. "Statistical Classification Methods in Consumer Credit Scoring: a Review," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 160(3), pages 523-541, September.
    4. D Wu & D L Olson, 2010. "Enterprise risk management: coping with model risk in a large bank," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 61(2), pages 179-190, February.
    5. Akkoç, Soner, 2012. "An empirical comparison of conventional techniques, neural networks and the three stage hybrid Adaptive Neuro Fuzzy Inference System (ANFIS) model for credit scoring analysis: The case of Turkish cred," European Journal of Operational Research, Elsevier, vol. 222(1), pages 168-178.
    6. Huseyin Ince & Bora Aktan, 2009. "A comparison of data mining techniques for credit scoring in banking: A managerial perspective," Journal of Business Economics and Management, Taylor & Francis Journals, vol. 10(3), pages 233-240, March.
    7. Carlos Serrano-Cinca & Begoña Gutiérrez-Nieto & Luz López-Palacios, 2015. "Determinants of Default in P2P Lending," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-22, October.
    8. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    9. Riza Emekter & Yanbin Tu & Benjamas Jirasakuldech & Min Lu, 2015. "Evaluating credit risk and loan performance in online Peer-to-Peer (P2P) lending," Applied Economics, Taylor & Francis Journals, vol. 47(1), pages 54-70, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Aneta Dzik-Walczak & Mateusz Heba, 2019. "A comparison of credit scoring techniques in Peer-to-Peer lending," Working Papers 2019-16, Faculty of Economic Sciences, University of Warsaw.
    2. Teply, Petr & Polena, Michal, 2020. "Best classification algorithms in peer-to-peer lending," The North American Journal of Economics and Finance, Elsevier, vol. 51(C).
    3. Li, Yibei & Wang, Ximei & Djehiche, Boualem & Hu, Xiaoming, 2020. "Credit scoring by incorporating dynamic networked information," European Journal of Operational Research, Elsevier, vol. 286(3), pages 1103-1112.
    4. Xia, Yufei & Zhao, Junhao & He, Lingyun & Li, Yinguo & Yang, Xiaoli, 2021. "Forecasting loss given default for peer-to-peer loans via heterogeneous stacking ensemble approach," International Journal of Forecasting, Elsevier, vol. 37(4), pages 1590-1613.
    5. Liu, Yi & Yang, Menglong & Wang, Yudong & Li, Yongshan & Xiong, Tiancheng & Li, Anzhe, 2022. "Applying machine learning algorithms to predict default probability in the online credit market: Evidence from China," International Review of Financial Analysis, Elsevier, vol. 79(C).
    6. Carlos Serrano-Cinca & Begoña Gutiérrez-Nieto & Luz López-Palacios, 2015. "Determinants of Default in P2P Lending," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-22, October.
    7. Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
    8. Mingfeng Tang & Mei Mei & Cuiwen Li & Xingyang Lv & Xushuang Li & Lihao Wang, 2020. "How does an individual’s default behavior on an online peer-to-peer lending platform influence an observer’s default intention?," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 6(1), pages 1-20, December.
    9. Ulf Römer & Oliver Musshoff, 2017. "Can agricultural credit scoring for microfinance institutions be implemented and improved by weather data?," Agricultural Finance Review, Emerald Group Publishing Limited, vol. 78(1), pages 83-97, December.
    10. Croux, Christophe & Jagtiani, Julapa & Korivi, Tarunsai & Vulanovic, Milos, 2020. "Important factors determining Fintech loan default: Evidence from a lendingclub consumer platform," Journal of Economic Behavior & Organization, Elsevier, vol. 173(C), pages 270-296.
    11. Gunnarsson, Björn Rafn & vanden Broucke, Seppe & Baesens, Bart & Óskarsdóttir, María & Lemahieu, Wilfried, 2021. "Deep learning for credit scoring: Do or don’t?," European Journal of Operational Research, Elsevier, vol. 295(1), pages 292-305.
    12. Štefan Lyócsa & Petra Vašaničová & Branka Hadji Misheva & Marko Dávid Vateha, 2022. "Default or profit scoring credit systems? Evidence from European and US peer-to-peer lending markets," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-21, December.
    13. Rasa Kanapickiene & Renatas Spicas, 2019. "Credit Risk Assessment Model for Small and Micro-Enterprises: The Case of Lithuania," Risks, MDPI, vol. 7(2), pages 1-23, June.
    14. Xueru Chen & Xiaoji Hu & Shenglin Ben, 2021. "How do reputation, structure design and FinTech ecosystem affect the net cash inflow of P2P lending platforms? Evidence from China," Electronic Commerce Research, Springer, vol. 21(4), pages 1055-1082, December.
    15. Michael Bucker & Gero Szepannek & Alicja Gosiewska & Przemyslaw Biecek, 2020. "Transparency, Auditability and eXplainability of Machine Learning Models in Credit Scoring," Papers 2009.13384, arXiv.org.
    16. Yuan, Kunpeng & Chi, Guotai & Zhou, Ying & Yin, Hailei, 2022. "A novel two-stage hybrid default prediction model with k-means clustering and support vector domain description," Research in International Business and Finance, Elsevier, vol. 59(C).
    17. Ha-Thu Nguyen, 2015. "How is credit scoring used to predict default in China?," EconomiX Working Papers 2015-1, University of Paris Nanterre, EconomiX.
    18. Kriebel, Johannes & Stitz, Lennart, 2022. "Credit default prediction from user-generated text in peer-to-peer lending using deep learning," European Journal of Operational Research, Elsevier, vol. 302(1), pages 309-323.
    19. Zhao Wang & Cuiqing Jiang & Huimin Zhao, 2022. "Know Where to Invest: Platform Risk Evaluation in Online Lending," Information Systems Research, INFORMS, vol. 33(3), pages 765-783, September.
    20. Ligang Zhou & Chao Ma, 2023. "A Comparison of Different Rules on Loans Evaluation in Peer-to-Peer Lending by Gradient Boosting Models Under Moving Windows with Two Timestamps," Computational Economics, Springer;Society for Computational Economics, vol. 62(4), pages 1481-1504, December.

    More about this item

    Keywords

    credit scoring; ensemble methods; logistic regression; neural nets; peer-to-peer lending;

    JEL classification:

    • G21 - Financial Economics - - Financial Institutions and Services - - - Banks; Other Depository Institutions; Micro Finance Institutions; Mortgages
    • G32 - Financial Economics - - Corporate Finance and Governance - - - Financing Policy; Financial Risk and Risk Management; Capital and Ownership Structure; Value of Firms; Goodwill

