
Will they repay their debt? Identification of borrowers likely to be charged off

Author

Listed:
  • Caplescu Raluca Dana

    (Bucharest University of Economic Studies, Bucharest, Romania)

  • Panaite Ana-Maria

    (Bucharest University of Economic Studies, Bucharest, Romania)

  • Pele Daniel Traian

    (Bucharest University of Economic Studies, Bucharest, Romania)

  • Strat Vasile Alecsandru

    (Bucharest University of Economic Studies, Bucharest, Romania)

Abstract

The recent increase in peer-to-peer lending has prompted the development of models that separate good clients from bad ones in order to mitigate risks for both lenders and platforms. The rapidly growing body of literature provides several comparisons between various models. Among the most frequently employed are logistic regression, Support Vector Machines, neural networks and decision tree-based models. Of these, logistic regression has proved to be a strong candidate, both because of its good performance and because of its high explainability. The present paper compares four pairs of models (for imbalanced and under-sampled data) meant to predict charged-off clients by optimizing the F1 score. We found that, if the data is balanced, Logistic Regression, both simple and with Stochastic Gradient Descent, outperforms LightGBM and K-Nearest Neighbors in optimizing the F1 score. We chose this metric because it balances the interests of the lenders with those of the platform. Loan term, debt-to-income ratio and number of accounts were found to be important predictors positively related to the risk of charge-off. At the other end of the spectrum, by far the strongest impact on charge-off probability is that of the FICO score. The final number of features retained by the two Logistic Regression models differs considerably because, although both use Lasso for feature selection, Stochastic Gradient Descent Logistic Regression applies stronger regularization. The analysis was performed in Python (numpy, pandas, sklearn and imblearn).
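
The two ideas the abstract relies on, under-sampling an imbalanced data set and scoring predictions with F1, can be illustrated with a minimal sketch. It is not the authors' code: the paper reports using sklearn and imblearn, whereas this example uses only the Python standard library so that the definitions are visible. The function names `f1_score` and `random_under_sample` and all data values are assumptions made up for illustration.

```python
import random
from collections import Counter


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_under_sample(rows, labels, seed=0):
    """Randomly keep as many rows per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy labels: 1 = charged off, 0 = fully paid.
labels = [0] * 8 + [1] * 2
rows = list(range(10))  # stand-ins for borrower feature rows
bal_rows, bal_labels = random_under_sample(rows, labels)
print(Counter(bal_labels))  # both classes now have the same count

# Hypothetical predictions scored against known outcomes.
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print(f1_score(y_true, y_pred))
```

In this toy case precision and recall are both 2/3, so F1 is also 2/3. Because F1 ignores true negatives, it stays informative when charged-off loans are a small minority, which is one plausible reading of why the abstract pairs it with under-sampling.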

Suggested Citation

  • Caplescu Raluca Dana & Panaite Ana-Maria & Pele Daniel Traian & Strat Vasile Alecsandru, 2020. "Will they repay their debt? Identification of borrowers likely to be charged off," Management & Marketing, Sciendo, vol. 15(3), pages 393-409, September.
  • Handle: RePEc:vrs:manmar:v:15:y:2020:i:3:p:393-409:n:4
    DOI: 10.2478/mmcks-2020-0023

    Download full text from publisher

    File URL: https://doi.org/10.2478/mmcks-2020-0023
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Caplescu Raluca Dana & Cojocea Manuela-Simona & Pele Daniel Traian & Strat Vasile Alecsandru, 2021. "Improvements in PD models. A case-study approach," Proceedings of the International Conference on Business Excellence, Sciendo, vol. 15(1), pages 13-32, December.
    2. Timothy Praditia & Thilo Walser & Sergey Oladyshkin & Wolfgang Nowak, 2020. "Improving Thermochemical Energy Storage Dynamics Forecast with Physics-Inspired Neural Network Architecture," Energies, MDPI, vol. 13(15), pages 1-26, July.
    3. Michael L. Polemis & Mike G. Tsionas, 2023. "The environmental consequences of blockchain technology: A Bayesian quantile cointegration analysis for Bitcoin," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 28(2), pages 1602-1621, April.
    4. Juan Rafael Ruiz & Patricia Stupariu & Ángel Vilariño, 2024. "The weakest links in the crisis of the Spanish Savings Banks," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 29(1), pages 654-664, January.
    5. Anastasios Petropoulos & Vasilis Siakoulis & Evaggelos Stavroulakis & Aristotelis Klamargias, 2019. "A robust machine learning approach for credit risk analysis of large loan level datasets using deep learning and extreme gradient boosting," IFC Bulletins chapters, in: Bank for International Settlements (ed.), Are post-crisis statistical initiatives completed?, volume 49, Bank for International Settlements.
    6. Anastasios Petropoulos & Vasilis Siakoulis & Evaggelos Stavroulakis & Aristotelis Klamargias, 2019. "A robust machine learning approach for credit risk analysis of large loan-level datasets using deep learning and extreme gradient boosting," IFC Bulletins chapters, in: Bank for International Settlements (ed.), The use of big data analytics and artificial intelligence in central banking, volume 50, Bank for International Settlements.
    7. Irving Fisher Committee, 2019. "The use of big data analytics and artificial intelligence in central banking," IFC Bulletins, Bank for International Settlements, number 50.
    8. Salman Bahoo & Marco Cucculelli & Xhoana Goga & Jasmine Mondolo, 2024. "Artificial intelligence in Finance: a comprehensive review through bibliometric and content analysis," SN Business & Economics, Springer, vol. 4(2), pages 1-46, February.
    9. A. R. Provenzano & D. Trifirò & A. Datteo & L. Giada & N. Jean & A. Riciputi & G. Le Pera & M. Spadaccino & L. Massaron & C. Nordio, 2020. "Machine Learning approach for Credit Scoring," Papers 2008.01687, arXiv.org.
    10. Sergio Edwin Torrico Salamanca, 2014. "Macro credit scoring as a proposal for quantifying credit risk," Investigación & Desarrollo 0814, Universidad Privada Boliviana, revised Nov 2014.
    11. D. S. Bidzhoyan, 2018. "Model For Assessing The Probability Of Revocation Of A License From The Russian Bank," Finance: Theory and Practice, Financial University under The Government of Russian Federation, vol. 22(2), pages 26-37.
    12. Wei Li & Florentina Paraschiv & Georgios Sermpinis, 2022. "A data-driven explainable case-based reasoning approach for financial risk detection," Quantitative Finance, Taylor & Francis Journals, vol. 22(12), pages 2257-2274, December.
    13. Jaewon Park & Minsoo Shin & Wookjae Heo, 2021. "Estimating the BIS Capital Adequacy Ratio for Korean Banks Using Machine Learning: Predicting by Variable Selection Using Random Forest Algorithms," Risks, MDPI, vol. 9(2), pages 1-19, February.
    14. Vikram Ojha & JeongHoe Lee, 2021. "Default analysis in mortgage risk with conventional and deep machine learning focusing on 2008–2009," Digital Finance, Springer, vol. 3(3), pages 249-271, December.
    15. Sabek Amine, 2023. "Unveiling the diverse efficacy of artificial neural networks and logistic regression: A comparative analysis in predicting financial distress," Croatian Review of Economic, Business and Social Statistics, Sciendo, vol. 9(1), pages 16-32, July.
    16. Haris Doukas & Panos Xidonas & Nikos Mastromichalakis, 2022. "How Successful are Energy Efficiency Investments? A Comparative Analysis for Classification & Performance Prediction," Computational Economics, Springer;Society for Computational Economics, vol. 59(2), pages 579-598, February.
    17. Anita Nandi & Partha Pratim Sengupta & Abhijit Dutta, 2019. "Diagnosing the Financial Distress in Oil Drilling and Exploration Sector of India through Discriminant Analysis," Vision, vol. 23(4), pages 364-373, December.
