Printed from https://ideas.repec.org/a/eee/ejores/v249y2016i2p427-439.html

An empirical comparison of classification algorithms for mortgage default prediction: evidence from a distressed mortgage market

Author

Listed:
  • Fitzpatrick, Trevor
  • Mues, Christophe

Abstract

This paper evaluates the performance of a number of modelling approaches for future mortgage default status. Boosted regression trees, random forests, penalised linear and semi-parametric logistic regression models are applied to four portfolios of over 300,000 Irish owner-occupier mortgages. The main findings are that the selected approaches have varying degrees of predictive power and that boosted regression trees significantly outperform logistic regression. This suggests that boosted regression trees can be a useful addition to the current toolkit for mortgage credit risk assessment by banks and regulators.

Suggested Citation

  • Fitzpatrick, Trevor & Mues, Christophe, 2016. "An empirical comparison of classification algorithms for mortgage default prediction: evidence from a distressed mortgage market," European Journal of Operational Research, Elsevier, vol. 249(2), pages 427-439.
  • Handle: RePEc:eee:ejores:v:249:y:2016:i:2:p:427-439
    DOI: 10.1016/j.ejor.2015.09.014

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221715008383
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Hand, David J., 2009. "Mining the past to determine the future: Problems and possibilities," International Journal of Forecasting, Elsevier, vol. 25(3), pages 441-451, July.
    2. Horton, Nicholas J. & Kleinman, Ken P., 2007. "Much Ado About Nothing: A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models," The American Statistician, American Statistical Association, vol. 61, pages 79-90, February.
    3. Galindo, J & Tamayo, P, 2000. "Credit Risk Assessment Using Statistical and Machine Learning: Basic Methodology and Risk Modeling Applications," Computational Economics, Springer;Society for Computational Economics, vol. 15(1-2), pages 107-143, April.
    4. B Baesens & T Van Gestel & S Viaene & M Stepanova & J Suykens & J Vanthienen, 2003. "Benchmarking state-of-the-art classification algorithms for credit scoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 54(6), pages 627-635, June.
    5. Martens, David & Baesens, Bart & Van Gestel, Tony & Vanthienen, Jan, 2007. "Comprehensible credit scoring models using rule extraction from support vector machines," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1466-1476, December.
    6. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    7. Kennedy, Gerard & McIndoe-Calder, Tara, 2012. "The Irish Mortgage Market: Stylised Facts, Negative Equity and Arrears," Quarterly Bulletin Articles, Central Bank of Ireland, pages 85-108, February.
    8. Khandani, Amir E. & Kim, Adlar J. & Lo, Andrew W., 2010. "Consumer credit-risk models via machine-learning algorithms," Journal of Banking & Finance, Elsevier, vol. 34(11), pages 2767-2787, November.
    9. Haughwout, Andrew & Peach, Richard & Tracy, Joseph, 2008. "Juvenile delinquent mortgages: Bad credit or bad economy?," Journal of Urban Economics, Elsevier, vol. 64(2), pages 246-257, September.
    10. Bastos, Joao, 2007. "Credit scoring with boosted decision trees," MPRA Paper 8034, University Library of Munich, Germany.
    11. Kuhn, Max, 2008. "Building Predictive Models in R Using the caret Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 28(i05).
    12. Yongheng Deng & John M. Quigley & Robert Van Order, 2000. "Mortgage Terminations, Heterogeneity and the Exercise of Mortgage Options," Econometrica, Econometric Society, vol. 68(2), pages 275-308, March.
    13. De Bock, Koen W. & Coussement, Kristof & Van den Poel, Dirk, 2010. "Ensemble classification based on generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 54(6), pages 1535-1546, June.
    14. Daniel Berg, 2007. "Bankruptcy prediction by generalized additive models," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 23(2), pages 129-143, March.
    15. David Feldman & Shulamith Gross, 2005. "Mortgage Default: Classification Trees Analysis," The Journal of Real Estate Finance and Economics, Springer, vol. 30(4), pages 369-396, June.
    16. T Bellotti & J Crook, 2009. "Credit scoring with macroeconomic variables using survival analysis," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(12), pages 1699-1707, December.
    17. K Kennedy & B Mac Namee & S J Delany, 2013. "Using semi-supervised classifiers for credit scoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 64(4), pages 513-529, April.
    18. Medema, Lydian & Koning, Ruud H. & Lensink, Robert, 2009. "A practical approach to validating a PD model," Journal of Banking & Finance, Elsevier, vol. 33(4), pages 701-708, April.
    19. Foote, Christopher L. & Gerardi, Kristopher & Willen, Paul S., 2008. "Negative equity and foreclosure: Theory and evidence," Journal of Urban Economics, Elsevier, vol. 64(2), pages 234-245, September.
    20. Das, Sanjiv R., 2012. "The Principal Principle," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 47(6), pages 1215-1246, December.
    21. Das, Sanjiv R. & Meadows, Ray, 2013. "Strategic loan modification: An options-based response to strategic default," Journal of Banking & Finance, Elsevier, vol. 37(2), pages 636-647.
    22. Friedman, Jerome H., 2002. "Stochastic gradient boosting," Computational Statistics & Data Analysis, Elsevier, vol. 38(4), pages 367-378, February.
    23. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    24. Hand, David J., 2009. "Mining the past to determine the future: Rejoinder," International Journal of Forecasting, Elsevier, vol. 25(3), pages 461-462, July.
    25. Crook, Jonathan N. & Edelman, David B. & Thomas, Lyn C., 2007. "Recent developments in consumer credit risk assessment," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1447-1465, December.
    26. Clifford M. Hurvich & Jeffrey S. Simonoff & Chih‐Ling Tsai, 1998. "Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(2), pages 271-293.

    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Vikram Ojha & JeongHoe Lee, 2021. "Default analysis in mortgage risk with conventional and deep machine learning focusing on 2008–2009," Digital Finance, Springer, vol. 3(3), pages 249-271, December.
    2. Kriebel, Johannes & Stitz, Lennart, 2022. "Credit default prediction from user-generated text in peer-to-peer lending using deep learning," European Journal of Operational Research, Elsevier, vol. 302(1), pages 309-323.
    3. Chi Ming Chen & Geoffrey Kwok Fai Tso & Kaijian He, 2024. "Quantum Optimized Cost Based Feature Selection and Credit Scoring for Mobile Micro-financing," Computational Economics, Springer;Society for Computational Economics, vol. 63(2), pages 919-950, February.
    4. Donglin Wang & Don Hong & Qiang Wu, 2023. "Prediction of Loan Rate for Mortgage Data: Deep Learning Versus Robust Regression," Computational Economics, Springer;Society for Computational Economics, vol. 61(3), pages 1137-1150, March.
    5. Yosi Borochov & Boris A. Portnov, 2021. "Estimating Environmentally Adjusted Risks of Mortgage Arrears for Different Socioeconomic Groups of Borrowers," European Research Studies Journal, European Research Studies Journal, vol. 0(2), pages 595-620.
    6. Medina-Olivares, Victor & Lindgren, Finn & Calabrese, Raffaella & Crook, Jonathan, 2023. "Joint models of multivariate longitudinal outcomes and discrete survival data with INLA: An application to credit repayment behaviour," European Journal of Operational Research, Elsevier, vol. 310(2), pages 860-873.
    7. Luong, Thi Mai & Scheule, Harald, 2022. "Benchmarking forecast approaches for mortgage credit risk for forward periods," European Journal of Operational Research, Elsevier, vol. 299(2), pages 750-767.
    8. Haskamp, Ulrich, 2017. "Improving the forecasts of European regional banks' profitability with machine learning algorithms," Ruhr Economic Papers 705, RWI - Leibniz-Institut für Wirtschaftsforschung, Ruhr-University Bochum, TU Dortmund University, University of Duisburg-Essen.
    9. Choi, So Eun & Jang, Hyun Jin & Lee, Kyungsub & Zheng, Harry, 2021. "Optimal market-Making strategies under synchronised order arrivals with deep neural networks," Journal of Economic Dynamics and Control, Elsevier, vol. 125(C).
    10. Masci, Chiara & Johnes, Geraint & Agasisti, Tommaso, 2018. "Student and school performance across countries: A machine learning approach," European Journal of Operational Research, Elsevier, vol. 269(3), pages 1072-1085.
    11. Chen, Shunqin & Guo, Zhengfeng & Zhao, Xinlei, 2021. "Predicting mortgage early delinquency with machine learning methods," European Journal of Operational Research, Elsevier, vol. 290(1), pages 358-372.
    12. Zeineb Affes & Rania Hentati-Kaffel, 2019. "Predicting US Banks Bankruptcy: Logit Versus Canonical Discriminant Analysis," Computational Economics, Springer;Society for Computational Economics, vol. 54(1), pages 199-244, June.
    13. Aneta Dzik-Walczak & Mateusz Heba, 2019. "A comparison of credit scoring techniques in Peer-to-Peer lending," Working Papers 2019-16, Faculty of Economic Sciences, University of Warsaw.
    14. Gupta, Mukul & Kumar, Pradeep, 2020. "Recommendation generation using personalized weight of meta-paths in heterogeneous information networks," European Journal of Operational Research, Elsevier, vol. 284(2), pages 660-674.
    15. Sheikh Rabiul Islam & William Eberle & Sheikh K. Ghafoor & Sid C. Bundy & Douglas A. Talbert & Ambareen Siraj, 2019. "Investigating bankruptcy prediction models in the presence of extreme class imbalance and multiple stages of economy," Papers 1911.09858, arXiv.org.
    16. Justin Sirignano & Apaar Sadhwani & Kay Giesecke, 2016. "Deep Learning for Mortgage Risk," Papers 1607.02470, arXiv.org, revised Mar 2018.
    17. Ting Sun & Miklos A. Vasarhelyi, 2018. "Predicting credit card delinquencies: An application of deep neural networks," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 25(4), pages 174-189, October.
    18. Bhattacharya, Arnab & Wilson, Simon P. & Soyer, Refik, 2019. "A Bayesian approach to modeling mortgage default and prepayment," European Journal of Operational Research, Elsevier, vol. 274(3), pages 1112-1124.
    19. Kolesnikova, A. & Yang, Y. & Lessmann, S. & Ma, T. & Sung, M.-C. & Johnson, J.E.V., 2019. "Can Deep Learning Predict Risky Retail Investors? A Case Study in Financial Risk Behavior Forecasting," IRTG 1792 Discussion Papers 2019-023, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    20. Michael Bucker & Gero Szepannek & Alicja Gosiewska & Przemyslaw Biecek, 2020. "Transparency, Auditability and eXplainability of Machine Learning Models in Credit Scoring," Papers 2009.13384, arXiv.org.
    21. Jun‐Tae Han & Jae‐Seok Choi & Myeon‐Jung Kim & Jina Jeong, 2018. "Developing a Risk Group Predictive Model for Korean Students Falling into Bad Debt," Asian Economic Journal, East Asian Economic Association, vol. 32(1), pages 3-14, March.
    22. Chen, Yujia & Calabrese, Raffaella & Martin-Barragan, Belen, 2024. "Interpretable machine learning for imbalanced credit scoring datasets," European Journal of Operational Research, Elsevier, vol. 312(1), pages 357-372.
    23. Yuan, Kunpeng & Chi, Guotai & Zhou, Ying & Yin, Hailei, 2022. "A novel two-stage hybrid default prediction model with k-means clustering and support vector domain description," Research in International Business and Finance, Elsevier, vol. 59(C).
    24. Jabeur, Sami Ben & Gharib, Cheima & Mefteh-Wali, Salma & Arfi, Wissal Ben, 2021. "CatBoost model and artificial intelligence techniques for corporate failure prediction," Technological Forecasting and Social Change, Elsevier, vol. 166(C).
    25. Aneta Dzik-Walczak & Mateusz Heba, 2021. "An implementation of ensemble methods, logistic regression, and neural network for default prediction in Peer-to-Peer lending," Zbornik radova Ekonomskog fakulteta u Rijeci/Proceedings of Rijeka Faculty of Economics, University of Rijeka, Faculty of Economics and Business, vol. 39(1), pages 163-197.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chen, Shunqin & Guo, Zhengfeng & Zhao, Xinlei, 2021. "Predicting mortgage early delinquency with machine learning methods," European Journal of Operational Research, Elsevier, vol. 290(1), pages 358-372.
    2. Richard Chamboko & Jorge Miguel Bravo, 2020. "A Multi-State Approach to Modelling Intermediate Events and Multiple Mortgage Loan Outcomes," Risks, MDPI, vol. 8(2), pages 1-29, June.
    3. Crone, Sven F. & Finlay, Steven, 2012. "Instance sampling in credit scoring: An empirical study of sample size and balancing," International Journal of Forecasting, Elsevier, vol. 28(1), pages 224-238.
    4. Stefan Lessmann & Stefan Voß, 2010. "Customer-Centric Decision Support," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 2(2), pages 79-93, April.
    5. Dimitris Andriosopoulos & Michalis Doumpos & Panos M. Pardalos & Constantin Zopounidis, 2019. "Computational approaches and data analytics in financial services: A literature review," Journal of the Operational Research Society, Taylor & Francis Journals, vol. 70(10), pages 1581-1599, October.
    6. TOBBACK, Ellen & MARTENS, David, 2017. "Retail credit scoring using fine-grained payment data," Working Papers 2017011, University of Antwerp, Faculty of Business and Economics.
    7. Reamonn Lyndon & Yvonne McCarthy, 2013. "What Lies Beneath? Understanding Recent Trends in Irish Mortgage Arrears," The Economic and Social Review, Economic and Social Studies, vol. 44(1), pages 117-150.
    8. Asish Saha & Hock-Eam Lim & Goh-Yeok Siew, 2021. "Housing Loan Repayment Behaviour in Malaysia: An Analytical Insight," International Journal of Business and Economics, School of Management Development, Feng Chia University, Taichung, Taiwan, vol. 20(2), pages 1-19, September.
    9. Goodstein, Ryan & Hanouna, Paul & Ramirez, Carlos D. & Stahel, Christof W., 2017. "Contagion effects in strategic mortgage defaults," Journal of Financial Intermediation, Elsevier, vol. 30(C), pages 50-60.
    10. Matthias Bogaert & Lex Delaere, 2023. "Ensemble Methods in Customer Churn Prediction: A Comparative Analysis of the State-of-the-Art," Mathematics, MDPI, vol. 11(5), pages 1-28, February.
    11. Michael Bucker & Gero Szepannek & Alicja Gosiewska & Przemyslaw Biecek, 2020. "Transparency, Auditability and eXplainability of Machine Learning Models in Credit Scoring," Papers 2009.13384, arXiv.org.
    12. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    13. Juan Laborda & Seyong Ryoo, 2021. "Feature Selection in a Credit Scoring Model," Mathematics, MDPI, vol. 9(7), pages 1-22, March.
    14. Lessmann, Stefan & Baesens, Bart & Seow, Hsin-Vonn & Thomas, Lyn C., 2015. "Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research," European Journal of Operational Research, Elsevier, vol. 247(1), pages 124-136.
    15. Mocetti, Sauro & Viviano, Eliana, 2017. "Looking behind mortgage delinquencies," Journal of Banking & Finance, Elsevier, vol. 75(C), pages 53-63.
    16. K. W. De Bock & D. Van Den Poel, 2012. "Reconciling Performance and Interpretability in Customer Churn Prediction using Ensemble Learning based on Generalized Additive Models," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 12/805, Ghent University, Faculty of Economics and Business Administration.
    17. Olson, Luke M. & Qi, Min & Zhang, Xiaofei & Zhao, Xinlei, 2021. "Machine learning loss given default for corporate debt," Journal of Empirical Finance, Elsevier, vol. 64(C), pages 144-159.
    18. Koen W. de Bock, 2017. "The best of two worlds: Balancing model strength and comprehensibility in business failure prediction using spline-rule ensembles," Post-Print hal-01588059, HAL.
    19. Kristopher Gerardi & Kyle F. Herkenhoff & Lee E. Ohanian & Paul S. Willen, 2018. "Can’t Pay or Won’t Pay? Unemployment, Negative Equity, and Strategic Default," The Review of Financial Studies, Society for Financial Studies, vol. 31(3), pages 1098-1131.
    20. Chen, Dangxing & Ye, Jiahui & Ye, Weicheng, 2023. "Interpretable selective learning in credit risk," Research in International Business and Finance, Elsevier, vol. 65(C).
