IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0117844.html

Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

Author

Listed:
  • Hong Wang
  • Qingsong Xu
  • Lifeng Zhou

Abstract

Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
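The pipeline the abstract describes (balance the data, fit several Lasso-logistic models, combine them, score with AUC and F-measure) can be illustrated with a minimal sketch. This is not the authors' implementation. Plain random undersampling with bootstrapping stands in for their clustering-and-bagging step, the L1 penalty is handled by a simple proximal gradient loop, and all function names, hyperparameters and the synthetic data are assumptions made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_lasso_logistic(X, y, lam=0.01, lr=0.5, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def fit_balanced_ensemble(X, y, n_models=10, lam=0.01, seed=0):
    """Fit each member on a bootstrap of the minority class plus an
    equally sized random draw from the majority class."""
    rng = random.Random(seed)
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    models = []
    for _ in range(n_models):
        idx = ([rng.choice(minority) for _ in minority]
               + [rng.choice(majority) for _ in minority])
        models.append(fit_lasso_logistic([X[i] for i in idx],
                                         [y[i] for i in idx], lam=lam))
    return models

def ensemble_proba(models, x):
    # Average the predicted default probabilities of all members.
    total = 0.0
    for w, b in models:
        total += sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
    return total / len(models)

def auc(y_true, scores):
    # Share of (positive, negative) pairs ranked correctly; ties count 1/2.
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def f_measure(y_true, y_pred):
    # Harmonic mean of precision and recall for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def make_synthetic(n=1000, pos_rate=0.05, seed=1):
    # Hypothetical unbalanced data: one informative feature, two noise features.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < pos_rate else 0
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y
```

A typical call would build the data with `make_synthetic()`, fit with `fit_balanced_ensemble(X, y)`, score each row with `ensemble_proba`, and pass the scores to `auc`, or threshold them at 0.5 before calling `f_measure`. Because every member is trained on a balanced sample, the averaged probabilities are calibrated to a 50/50 class mix rather than to the original default rate.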

Suggested Citation

  • Hong Wang & Qingsong Xu & Lifeng Zhou, 2015. "Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-20, February.
  • Handle: RePEc:plo:pone00:0117844
    DOI: 10.1371/journal.pone.0117844

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0117844
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0117844&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Wiginton, John C., 1980. "A Note on the Comparison of Logit and Discriminant Models of Consumer Credit Behavior," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 15(3), pages 757-770, September.
    2. Desai, Vijay S. & Crook, Jonathan N. & Overstreet, George A., 1996. "A comparison of neural networks and linear scoring models in the credit union environment," European Journal of Operational Research, Elsevier, vol. 95(1), pages 24-37, November.
    3. D. J. Hand & W. E. Henley, 1997. "Statistical Classification Methods in Consumer Credit Scoring: a Review," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 160(3), pages 523-541, September.
    4. Crone, Sven F. & Finlay, Steven, 2012. "Instance sampling in credit scoring: An empirical study of sample size and balancing," International Journal of Forecasting, Elsevier, vol. 28(1), pages 224-238.
    5. L C Thomas & R W Oliver & D J Hand, 2005. "A survey of the issues in consumer credit modelling research," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 56(9), pages 1006-1015, September.
    6. B Baesens & T Van Gestel & S Viaene & M Stepanova & J Suykens & J Vanthienen, 2003. "Benchmarking state-of-the-art classification algorithms for credit scoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 54(6), pages 627-635, June.
    7. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    8. Altman, Edward I. & Marco, Giancarlo & Varetto, Franco, 1994. "Corporate distress diagnosis: Comparisons using linear discriminant analysis and neural networks (the Italian experience)," Journal of Banking & Finance, Elsevier, vol. 18(3), pages 505-529, May.
    9. Thomas, Lyn C., 2000. "A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers," International Journal of Forecasting, Elsevier, vol. 16(2), pages 149-172.
    10. Paleologo, Giuseppe & Elisseeff, André & Antonini, Gianluca, 2010. "Subagging for credit scoring models," European Journal of Operational Research, Elsevier, vol. 201(2), pages 490-499, March.
    11. Ang, James S & Chua, Jess H & Bowling, Clinton H, 1979. "The Profiles of Late-Paying Consumer Loan Borrowers: An Exploratory Study: A Note," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 11(2), pages 222-226, May.

    Citations

    Cited by:

    1. Asoke K Nandi & Kuldeep Kaur Randhawa & Hong Siang Chua & Manjeevan Seera & Chee Peng Lim, 2022. "Credit card fraud detection using a hierarchical behavior-knowledge space model," PLOS ONE, Public Library of Science, vol. 17(1), pages 1-16, January.
    2. Caporin, Massimiliano & Poli, Francesco, 2022. "News and intraday jumps: Evidence from regularization and class imbalance," The North American Journal of Economics and Finance, Elsevier, vol. 62(C).
    3. Martin Leo & Suneel Sharma & K. Maddulety, 2019. "Machine Learning in Banking Risk Management: A Literature Review," Risks, MDPI, vol. 7(1), pages 1-22, March.
    4. Bauer, Kevin & Pfeuffer, Nicolas & Abdel-Karim, Benjamin M. & Hinz, Oliver & Kosfeld, Michael, 2020. "The terminator of social welfare? The economic consequences of algorithmic discrimination," SAFE Working Paper Series 287, Leibniz Institute for Financial Research SAFE.
    5. Sun, Yue & Chai, Nana & Dong, Yizhe & Shi, Baofeng, 2022. "Assessing and predicting small industrial enterprises’ credit ratings: A fuzzy decision-making approach," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1158-1172.
    6. Yajiao Tang & Junkai Ji & Yulin Zhu & Shangce Gao & Zheng Tang & Yuki Todo, 2019. "A Differential Evolution-Oriented Pruning Neural Network Model for Bankruptcy Prediction," Complexity, Hindawi, vol. 2019, pages 1-21, August.
    7. Dimitrios Nikolaidis & Michalis Doumpos, 2022. "Credit Scoring with Drift Adaptation Using Local Regions of Competence," SN Operations Research Forum, Springer, vol. 3(4), pages 1-28, December.
    8. Yasmin Agueda Rios-Solis & Mario Alberto Saucedo-Espinosa & Gabriel Arturo Caballero-Robledo, 2017. "Repayment policy for multiple loans," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-12, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dumitrescu, Elena & Hué, Sullivan & Hurlin, Christophe & Tokpavi, Sessi, 2022. "Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects," European Journal of Operational Research, Elsevier, vol. 297(3), pages 1178-1192.
    2. Hussein A. Abdou & John Pointon, 2011. "Credit Scoring, Statistical Techniques And Evaluation Criteria: A Review Of The Literature," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 18(2-3), pages 59-88, April.
    3. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    4. Huei-Wen Teng & Michael Lee, 2019. "Estimation Procedures of Using Five Alternative Machine Learning Methods for Predicting Credit Card Default," Review of Pacific Basin Financial Markets and Policies (RPBFMP), World Scientific Publishing Co. Pte. Ltd., vol. 22(03), pages 1-27, September.
    5. Crone, Sven F. & Finlay, Steven, 2012. "Instance sampling in credit scoring: An empirical study of sample size and balancing," International Journal of Forecasting, Elsevier, vol. 28(1), pages 224-238.
    6. Carlos Serrano-Cinca & Begoña Gutiérrez-Nieto & Nydia M. Reyes, 2013. "A Social Approach to Microfinance Credit Scoring," Working Papers CEB 13-013, ULB -- Universite Libre de Bruxelles.
    7. Rais Ahmad Itoo & A. Selvarasu & José António Filipe, 2015. "Loan Products and Credit Scoring by Commercial Banks (India)," International Journal of Finance, Insurance and Risk Management, International Journal of Finance, Insurance and Risk Management, vol. 5(1), pages 851-851.
    8. Ahmed Almustfa Hussin Adam Khatir & Marco Bee, 2022. "Machine Learning Models and Data-Balancing Techniques for Credit Scoring: What Is the Best Combination?," Risks, MDPI, vol. 10(9), pages 1-22, August.
    9. G Verstraeten & D Van den Poel, 2005. "The impact of sample bias on consumer credit scoring performance and profitability," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 56(8), pages 981-992, August.
    10. Brad S. Trinkle & Amelia A. Baldwin, 2007. "Interpretable credit model development via artificial neural networks," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 15(3‐4), pages 123-147, July.
    11. L C Thomas, 2010. "Consumer finance: challenges for operational research," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 61(1), pages 41-52, January.
    12. Thomas, Lyn C., 2000. "A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers," International Journal of Forecasting, Elsevier, vol. 16(2), pages 149-172.
    13. José Willer Prado & Valderí Castro Alcântara & Francisval Melo Carvalho & Kelly Carvalho Vieira & Luiz Kennedy Cruz Machado & Dany Flávio Tonelli, 2016. "Multivariate analysis of credit risk and bankruptcy research data: a bibliometric study involving different knowledge fields (1968–2014)," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(3), pages 1007-1029, March.
    14. Finlay, Steven, 2011. "Multiple classifier architectures and their application to credit risk assessment," European Journal of Operational Research, Elsevier, vol. 210(2), pages 368-378, April.
    15. Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
    16. Thomas Wainwright, 2011. "Elite Knowledges: Framing Risk and the Geographies of Credit," Environment and Planning A, vol. 43(3), pages 650-665, March.
    17. Casado Yusta, Silvia & Núñez Letamendía, Laura & Pacheco Bonrostro, Joaquín Antonio, 2018. "Predicting Corporate Failure: The GRASP-LOGIT Model || Predicción de la quiebra empresarial: el modelo GRASP-LOGIT," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 26(1), pages 294-314, Diciembre.
    18. Adnan Dželihodžić & Dženana Đonko & Jasmin Kevrić, 2018. "Improved Credit Scoring Model Based on Bagging Neural Network," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 17(06), pages 1725-1741, November.
    19. Dinh, Thi Huyen Thanh & Kleimeier, Stefanie, 2007. "A credit scoring model for Vietnam's retail banking market," International Review of Financial Analysis, Elsevier, vol. 16(5), pages 471-495.
    20. Ha-Thu Nguyen, 2015. "How is credit scoring used to predict default in China?," EconomiX Working Papers 2015-1, University of Paris Nanterre, EconomiX.
