
How can lenders prosper? Comparing machine learning approaches to identify profitable peer-to-peer loan investments

Authors
  • Fitzpatrick, Trevor
  • Mues, Christophe

Abstract

Successful Peer-to-Peer (P2P) lending requires an evaluation of loan profitability from a large universe of loans. Predictions of loan profitability may be useful to rank potential investments. We investigate whether various types of prediction methods and the types of information contained in loan listing features matter for profitable investment. A range of methods and performance metrics are used to benchmark predictive performance, based on a large dataset of P2P loans issued on Lending Club. Robust linear mixed models are used to investigate performance differences between models, according to whether they assume linearity, whether they build ensembles, and which types of predictors they use. The main findings are that: linear methods perform surprisingly well on several (but not all) criteria; whether ensemble methods perform better than individual methods is measure dependent; the use of alternative text-based information does not improve profit scoring outcomes. We conclude that P2P lenders could potentially increase their investment returns by applying linear methods that directly predict the internal rate of return instead of other dependent variables such as loan default.
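To make the dependent variable named in the abstract concrete, the following is a minimal sketch of how a loan's internal rate of return (IRR) could be computed from its cash flows and used to rank loans. It is not the authors' pipeline: the bisection solver, the function names `npv` and `irr`, and the two example loans are assumptions introduced purely for illustration, and the rates are per payment period rather than annualised.

```python
from typing import Sequence


def npv(rate: float, cashflows: Sequence[float]) -> float:
    """Net present value of per-period cash flows at a per-period rate."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))


def irr(cashflows: Sequence[float], lo: float = -0.99, hi: float = 1.0,
        tol: float = 1e-10) -> float:
    """Per-period IRR found by bisection.

    Assumes an initial outflow followed by non-negative inflows, so that
    NPV is decreasing in the rate and has a single root in (lo, hi).
    """
    f_lo, f_hi = npv(lo, cashflows), npv(hi, cashflows)
    if f_lo * f_hi > 0:
        raise ValueError("IRR is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(mid, cashflows)
        if f_lo * f_mid <= 0:
            hi = mid              # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # root lies in [mid, hi]
    return (lo + hi) / 2.0


# Hypothetical loans: 1000 lent, ten scheduled payments of 110 each.
loans = {
    "loan_A": [-1000.0] + [110.0] * 10,               # repaid in full
    "loan_B": [-1000.0] + [110.0] * 5 + [0.0] * 5,    # defaults after 5 payments
}

# Rank loans from highest to lowest realised IRR.
ranked = sorted(((name, irr(cf)) for name, cf in loans.items()),
                key=lambda item: item[1], reverse=True)

for name, rate in ranked:
    print(f"{name}: per-period IRR = {rate:.4f}")
```

A profit-scoring model in the sense of the abstract would be trained to predict this quantity from listing features, so that candidate loans can be ranked by predicted return. A default-scoring model would instead predict only whether a loan like `loan_B` stops paying. A loan with no repayments at all has no IRR under this solver and raises `ValueError`.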

Suggested Citation

  • Fitzpatrick, Trevor & Mues, Christophe, 2021. "How can lenders prosper? Comparing machine learning approaches to identify profitable peer-to-peer loan investments," European Journal of Operational Research, Elsevier, vol. 294(2), pages 711-722.
  • Handle: RePEc:eee:ejores:v:294:y:2021:i:2:p:711-722
    DOI: 10.1016/j.ejor.2021.01.047

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221721000771
    Download Restriction: Full text for ScienceDirect subscribers only

