
A prediction-driven mixture cure model and its application in credit scoring

Authors

  • Jiang, Cuiqing
  • Wang, Zhao
  • Zhao, Huimin

Abstract

In the credit market, assessment of a borrower's default risk over time is essential to enabling timely risk management, since borrowers’ exposure to risk and the losses that result from defaults are strongly related to the time when they default. Mixture cure models, with their ability to predict not only whether borrowers will default but also when they are likely to default, have been applied to credit scoring. We propose a prediction-driven mixture cure model, which sacrifices interpretability for potentially better prediction performance, and apply it to credit scoring. In the incidence part of the mixture cure model, we substitute the typical statistical incidence model (i.e., logistic regression) with a more flexible, and hopefully more accurate, classification method (i.e., random forests). For the latency part, we propose a survival analysis model, named Time-Dependent Hazards, which accommodates a direct relationship between failure times and covariates and can potentially better predict the probability of default over time than the standard Cox PH model. Empirical evaluation using real-world data from a major P2P lending institution in China shows that both extensions contributed to performance improvement in both discrimination and calibration.
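The two-part structure described in the abstract can be illustrated with the standard mixture cure formulation, in which the population survival function is S(t | x) = 1 - π(x) + π(x) · S_u(t | x). Here π(x) is the incidence part (the probability that a borrower ever defaults) and S_u(t | x) is the latency part (the survival function of those who do default). The sketch below is a minimal illustration of that combination step only, not the authors' implementation: the function names, the exponential latency curve, and all numbers are hypothetical. In the paper, π(x) would come from a random forest and S_u from the proposed Time-Dependent Hazards model.

```python
import math


def population_survival(pi_default: float, s_uncured: float) -> float:
    """Standard mixture cure survival: S(t) = (1 - pi) + pi * S_u(t).

    pi_default: incidence probability that the borrower ever defaults.
    s_uncured:  latency survival S_u(t), i.e. the probability that a
                borrower who will default has not yet defaulted by time t.
    """
    if not (0.0 <= pi_default <= 1.0 and 0.0 <= s_uncured <= 1.0):
        raise ValueError("both inputs must be probabilities in [0, 1]")
    return (1.0 - pi_default) + pi_default * s_uncured


def default_probability_by(pi_default: float, s_uncured: float) -> float:
    """Probability of default on or before time t: 1 - S(t) = pi * (1 - S_u(t))."""
    return 1.0 - population_survival(pi_default, s_uncured)


def toy_latency_survival(t_months: float, rate: float = 0.1) -> float:
    """Hypothetical stand-in for the latency model: exponential survival."""
    return math.exp(-rate * t_months)


# Hypothetical borrower: incidence score 0.2, evaluated at 12 months.
pd_12 = default_probability_by(0.2, toy_latency_survival(12))
print(f"Probability of default within 12 months: {pd_12:.4f}")
```

Because the incidence and latency parts enter only through π(x) and S_u(t | x), either component can be swapped for a different estimator without changing the combination step, which is the property the proposed extensions rely on.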

Suggested Citation

  • Jiang, Cuiqing & Wang, Zhao & Zhao, Huimin, 2019. "A prediction-driven mixture cure model and its application in credit scoring," European Journal of Operational Research, Elsevier, vol. 277(1), pages 20-31.
  • Handle: RePEc:eee:ejores:v:277:y:2019:i:1:p:20-31
    DOI: 10.1016/j.ejor.2019.01.072

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221719301092
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Ma, Jun & Heritier, Stephane & Lô, Serigne N., 2014. "On the maximum penalized likelihood approach for proportional hazard models with right censored survival data," Computational Statistics & Data Analysis, Elsevier, vol. 74(C), pages 142-156.
    2. Guo, Yanhong & Zhou, Wenjun & Luo, Chunyu & Liu, Chuanren & Xiong, Hui, 2016. "Instance-based credit risk assessment for investment decisions in P2P lending," European Journal of Operational Research, Elsevier, vol. 249(2), pages 417-426.
    3. B Baesens & T Van Gestel & S Viaene & M Stepanova & J Suykens & J Vanthienen, 2003. "Benchmarking state-of-the-art classification algorithms for credit scoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 54(6), pages 627-635, June.
    4. D. J. Hand & W. E. Henley, 1997. "Statistical Classification Methods in Consumer Credit Scoring: a Review," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 160(3), pages 523-541, September.
    5. Eric Rosenberg & Alan Gleit, 1994. "Quantitative Methods in Credit Management: A Survey," Operations Research, INFORMS, vol. 42(4), pages 589-613, August.
    6. Zhezhen Jin, 2003. "Rank-based inference for the accelerated failure time model," Biometrika, Biometrika Trust, vol. 90(2), pages 341-353, June.
    7. Djeundje, Viani Biatat & Crook, Jonathan, 2019. "Dynamic survival models with varying coefficients for credit risks," European Journal of Operational Research, Elsevier, vol. 275(1), pages 319-333.
    8. D J Hand & M G Kelly, 2001. "Lookahead scorecards for new fixed term credit products," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 52(9), pages 989-996, September.
    9. Yingwei Peng & Keith B. G. Dear, 2000. "A Nonparametric Mixture Model for Cure Rate Estimation," Biometrics, The International Biometric Society, vol. 56(1), pages 237-243, March.
    10. Daniele De Leonardis & Roberto Rocci, 2014. "Default risk analysis via a discrete‐time cure rate model," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 30(5), pages 529-543, September.
    11. Tong, Edward N.C. & Mues, Christophe & Thomas, Lyn C., 2012. "Mixture cure models in credit scoring: If and when borrowers default," European Journal of Operational Research, Elsevier, vol. 218(1), pages 132-139.
    12. Hajjem, Ahlem & Bellavance, François & Larocque, Denis, 2011. "Mixed effects regression trees for clustered data," Statistics & Probability Letters, Elsevier, vol. 81(4), pages 451-459, April.
    13. T Bellotti & J Crook, 2009. "Credit scoring with macroeconomic variables using survival analysis," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(12), pages 1699-1707, December.
    14. J Banasik & J N Crook & L C Thomas, 1999. "Not if but when will borrowers default," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 50(12), pages 1185-1190, December.
    15. Judy P. Sy & Jeremy M. G. Taylor, 2000. "Estimation in a Cox Proportional Hazards Cure Model," Biometrics, The International Biometric Society, vol. 56(1), pages 227-236, March.
    16. Dirick, Lore & Claeskens, Gerda & Baesens, Bart, 2015. "An Akaike information criterion for multiple event mixture cure models," European Journal of Operational Research, Elsevier, vol. 241(2), pages 449-457.
    17. Bhattacharya, Arnab & Wilson, Simon P. & Soyer, Refik, 2019. "A Bayesian approach to modeling mortgage default and prepayment," European Journal of Operational Research, Elsevier, vol. 274(3), pages 1112-1124.
    18. Maria Stepanova & Lyn Thomas, 2002. "Survival Analysis Methods for Personal Loan Data," Operations Research, INFORMS, vol. 50(2), pages 277-289, April.
    19. Zhang, Jiajia & Peng, Yingwei, 2007. "An alternative estimation method for the accelerated failure time frailty model," Computational Statistics & Data Analysis, Elsevier, vol. 51(9), pages 4413-4423, May.
    20. Finlay, Steven, 2011. "Multiple classifier architectures and their application to credit risk assessment," European Journal of Operational Research, Elsevier, vol. 210(2), pages 368-378, April.
    21. J-K Im & D W Apley & C Qi & X Shan, 2012. "A time-dependent proportional hazards survival model for credit risk analysis," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 63(3), pages 306-321, March.
    22. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    23. Liu, Fan & Hua, Zhongsheng & Lim, Andrew, 2015. "Identifying future defaulters: A hierarchical Bayesian method," European Journal of Operational Research, Elsevier, vol. 241(1), pages 202-211.
    24. Djeundje, Viani Biatat & Crook, Jonathan, 2018. "Incorporating heterogeneity and macroeconomic variables into multi-state delinquency models for credit cards," European Journal of Operational Research, Elsevier, vol. 271(2), pages 697-709.
    25. Yung-Chia Chang & Kuei-Hu Chang & Heng-Hsuan Chu & Lee-Ing Tong, 2016. "Establishing decision tree-based short-term default credit risk assessment models," Communications in Statistics - Theory and Methods, Taylor & Francis Journals, vol. 45(23), pages 6803-6815, December.
    26. Zhang, Jie & Thomas, Lyn C., 2012. "Comparisons of linear regression and survival analysis using single and mixture distributions approaches in modelling LGD," International Journal of Forecasting, Elsevier, vol. 28(1), pages 204-215.

    Citations

    Cited by:

    1. Yuan Wang & Liping Yang & Jun Wu & Zisheng Song & Li Shi, 2022. "Mining Campus Big Data: Prediction of Career Choice Using Interpretable Machine Learning Method," Mathematics, MDPI, vol. 10(8), pages 1-18, April.
    2. Liu, Yi & Yang, Menglong & Wang, Yudong & Li, Yongshan & Xiong, Tiancheng & Li, Anzhe, 2022. "Applying machine learning algorithms to predict default probability in the online credit market: Evidence from China," International Review of Financial Analysis, Elsevier, vol. 79(C).
    3. Li, Aimin & Li, Zhiyong & Bellotti, Anthony, 2023. "Predicting loss given default of unsecured consumer loans with time-varying survival scores," Pacific-Basin Finance Journal, Elsevier, vol. 78(C).
    4. Gunnarsson, Björn Rafn & vanden Broucke, Seppe & Baesens, Bart & Óskarsdóttir, María & Lemahieu, Wilfried, 2021. "Deep learning for credit scoring: Do or don’t?," European Journal of Operational Research, Elsevier, vol. 295(1), pages 292-305.
    5. Haupt, Johannes & Lessmann, Stefan, 2022. "Targeting customers under response-dependent costs," European Journal of Operational Research, Elsevier, vol. 297(1), pages 369-379.
    6. Wang, Chengfu & Chen, Xiangfeng & Jin, Wei & Fan, Xiaojun, 2022. "Credit guarantee types for financing retailers through online peer-to-peer lending: Equilibrium and coordinating strategy," European Journal of Operational Research, Elsevier, vol. 297(1), pages 380-392.
    7. Yang, Qi & He, Haijin & Lu, Bin & Song, Xinyuan, 2022. "Mixture additive hazards cure model with latent variables: Application to corporate default data," Computational Statistics & Data Analysis, Elsevier, vol. 167(C).
    8. Silva, Diego M.B. & Pereira, Gustavo H.A. & Magalhães, Tiago M., 2022. "A class of categorization methods for credit scoring models," European Journal of Operational Research, Elsevier, vol. 296(1), pages 323-331.
    9. Ana Ezquerro & Brais Cancela & Ana López-Cheda, 2023. "On the Reliability of Machine Learning Models for Survival Analysis When Cure Is a Possibility," Mathematics, MDPI, vol. 11(19), pages 1-21, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Liu, Fan & Hua, Zhongsheng & Lim, Andrew, 2015. "Identifying future defaulters: A hierarchical Bayesian method," European Journal of Operational Research, Elsevier, vol. 241(1), pages 202-211.
    2. Lore Dirick & Gerda Claeskens & Bart Baesens, 2017. "Time to default in credit scoring using survival analysis: a benchmark study," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 68(6), pages 652-665, June.
    3. Dirick, Lore & Claeskens, Gerda & Vasnev, Andrey & Baesens, Bart, 2022. "A hierarchical mixture cure model with unobserved heterogeneity for credit risk," Econometrics and Statistics, Elsevier, vol. 22(C), pages 39-55.
    4. Tong, Edward N.C. & Mues, Christophe & Thomas, Lyn C., 2012. "Mixture cure models in credit scoring: If and when borrowers default," European Journal of Operational Research, Elsevier, vol. 218(1), pages 132-139.
    5. Li, Zhiyong & Li, Aimin & Bellotti, Anthony & Yao, Xiao, 2023. "The profitability of online loans: A competing risks analysis on default and prepayment," European Journal of Operational Research, Elsevier, vol. 306(2), pages 968-985.
    6. Richard Chamboko & Jorge M. Bravo, 2016. "On the modelling of prognosis from delinquency to normal performance on retail consumer loans," Risk Management, Palgrave Macmillan, vol. 18(4), pages 264-287, December.
    7. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    8. Calabrese, Raffaella & Crook, Jonathan, 2020. "Spatial contagion in mortgage defaults: A spatial dynamic survival model with time and space varying coefficients," European Journal of Operational Research, Elsevier, vol. 287(2), pages 749-761.
    9. Li, Aimin & Li, Zhiyong & Bellotti, Anthony, 2023. "Predicting loss given default of unsecured consumer loans with time-varying survival scores," Pacific-Basin Finance Journal, Elsevier, vol. 78(C).
    10. Dirick, Lore & Claeskens, Gerda & Baesens, Bart, 2015. "An Akaike information criterion for multiple event mixture cure models," European Journal of Operational Research, Elsevier, vol. 241(2), pages 449-457.
    11. Thi Mai Luong, 2020. "Selection Effects of Lender and Borrower Choices on Risk Measurement, Management and Prudential Regulation," PhD Thesis, Finance Discipline Group, UTS Business School, University of Technology, Sydney, number 3-2020.
    12. Bocchio, Cecilia & Crook, Jonathan & Andreeva, Galina, 2023. "The impact of macroeconomic scenarios on recurrent delinquency: A stress testing framework of multi-state models for mortgages," International Journal of Forecasting, Elsevier, vol. 39(4), pages 1655-1677.
    13. Richard Chamboko & Jorge Miguel Bravo, 2020. "A Multi-State Approach to Modelling Intermediate Events and Multiple Mortgage Loan Outcomes," Risks, MDPI, vol. 8(2), pages 1-29, June.
    14. L C Thomas, 2010. "Consumer finance: challenges for operational research," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 61(1), pages 41-52, January.
    15. Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
    16. Xu, Linzhi & Zhang, Jiajia, 2010. "Multiple imputation method for the semiparametric accelerated failure time mixture cure model," Computational Statistics & Data Analysis, Elsevier, vol. 54(7), pages 1808-1816, July.
    17. Zhao Wang & Cuiqing Jiang & Huimin Zhao, 2022. "Know Where to Invest: Platform Risk Evaluation in Online Lending," Information Systems Research, INFORMS, vol. 33(3), pages 765-783, September.
    18. Rais Ahmad Itoo & A. Selvarasu & José António Filipe, 2015. "Loan Products and Credit Scoring by Commercial Banks (India)," International Journal of Finance, Insurance and Risk Management, International Journal of Finance, Insurance and Risk Management, vol. 5(1), pages 851-851.
    19. Andreea Costea, 2017. "A Quantitative Approach to Credit Risk Management in the Underwriting Process for the Retail Portfolio," Romanian Economic Journal, Department of International Business and Economics from the Academy of Economic Studies Bucharest, vol. 20(63), pages 157-186, March.
    20. Chen, Dangxing & Ye, Jiahui & Ye, Weicheng, 2023. "Interpretable selective learning in credit risk," Research in International Business and Finance, Elsevier, vol. 65(C).
