Printed from https://ideas.repec.org/a/bla/jorssa/v183y2020i2p631-654.html

Inferring the outcomes of rejected loans: an application of semisupervised clustering

Author

Listed:
  • Zhiyong Li
  • Xinyi Hu
  • Ke Li
  • Fanyin Zhou
  • Feng Shen

Abstract

Rejection inference aims to reduce sample bias and to improve model performance in credit scoring. We propose a semisupervised clustering approach as a new rejection inference technique. K‐prototype clustering can deal with mixed types of numeric and categorical characteristics, which are common in consumer credit data. We identify homogeneous acceptances and rejections and assign labels to part of the rejections according to the label of acceptances. We test the performance of various rejection inference methods in logit, support vector machine and random‐forests models based on data sets of real consumer loans. The predictions of clustering rejection inference show advantages over other traditional rejection inference methods. Inferring the label of the rejection from semisupervised clustering is found to help to mitigate the sample bias problem and to improve the predictive accuracy.
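
The labelling idea in the abstract can be illustrated with a small sketch. This is not the authors' procedure: it is a minimal pure-Python k-prototypes-style clustering (squared Euclidean distance on numeric features plus a weight gamma times the number of categorical mismatches), followed by a simple rule that gives a rejected applicant the majority label of the accepted applicants in its cluster, and only when that majority is clear. The function names, the toy records, the seed indices, and the gamma and purity parameters are all assumptions made for illustration.

```python
from collections import Counter

def dissimilarity(a, b, gamma):
    """Squared Euclidean distance on numerics + gamma * categorical mismatches."""
    (num_a, cat_a), (num_b, cat_b) = a, b
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    mismatches = sum(x != y for x, y in zip(cat_a, cat_b))
    return numeric + gamma * mismatches

def prototype(members):
    """Cluster prototype: mean of each numeric feature, mode of each categorical one."""
    means = tuple(sum(col) / len(col) for col in zip(*(m[0] for m in members)))
    modes = tuple(Counter(col).most_common(1)[0][0]
                  for col in zip(*(m[1] for m in members)))
    return (means, modes)

def k_prototypes(points, init_idx, gamma=1.0, max_iter=20):
    """Alternate assignment and prototype updates until assignments stop changing."""
    protos = [points[i] for i in init_idx]  # deterministic seeds (an assumption)
    assign = None
    for _ in range(max_iter):
        new_assign = [min(range(len(protos)),
                          key=lambda k: dissimilarity(p, protos[k], gamma))
                      for p in points]
        if new_assign == assign:
            break
        assign = new_assign
        for k in range(len(protos)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # keep the old prototype if a cluster empties
                protos[k] = prototype(members)
    return assign

def infer_reject_labels(accepted, accepted_labels, rejected, init_idx,
                        gamma=1.0, purity=0.8):
    """Cluster accepted and rejected cases together; label a rejected case only
    if the accepted cases in its cluster agree strongly enough (hypothetical rule)."""
    assign = k_prototypes(accepted + rejected, init_idx, gamma)
    n_acc = len(accepted)
    inferred = []
    for cluster in assign[n_acc:]:
        votes = Counter(lab for lab, c in zip(accepted_labels, assign[:n_acc])
                        if c == cluster)
        if not votes:
            inferred.append(None)  # no accepted neighbours: leave unlabelled
            continue
        label, count = votes.most_common(1)[0]
        inferred.append(label if count / sum(votes.values()) >= purity else None)
    return inferred

# Toy records: ((scaled income,), (housing status,)); label 0 = repaid, 1 = defaulted.
accepted = [((0.9,), ("own",)), ((0.8,), ("own",)),
            ((0.2,), ("rent",)), ((0.1,), ("rent",))]
accepted_labels = [0, 0, 1, 1]
rejected = [((0.85,), ("own",)), ((0.15,), ("rent",))]

print(infer_reject_labels(accepted, accepted_labels, rejected, init_idx=[0, 2]))
# prints [0, 1]
```

Rejected cases that fall in clusters with mixed or no accepted outcomes stay unlabelled (None), which mirrors the abstract's point that only part of the rejections receive labels. In practice, numeric features would need scaling and gamma would need tuning so that neither feature type dominates the distance.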

Suggested Citation

  • Zhiyong Li & Xinyi Hu & Ke Li & Fanyin Zhou & Feng Shen, 2020. "Inferring the outcomes of rejected loans: an application of semisupervised clustering," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(2), pages 631-654, February.
  • Handle: RePEc:bla:jorssa:v:183:y:2020:i:2:p:631-654
    DOI: 10.1111/rssa.12534

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssa.12534
    Download Restriction: no


    References listed on IDEAS

    1. Anderson, Raymond, 2007. "The Credit Scoring Toolkit: Theory and Practice for Retail Credit Risk Management and Decision Automation," OUP Catalogue, Oxford University Press, number 9780199226405, December.
    2. Guo, Yanhong & Zhou, Wenjun & Luo, Chunyu & Liu, Chuanren & Xiong, Hui, 2016. "Instance-based credit risk assessment for investment decisions in P2P lending," European Journal of Operational Research, Elsevier, vol. 249(2), pages 417-426.
    3. D. J. Hand & W. E. Henley, 1997. "Statistical Classification Methods in Consumer Credit Scoring: a Review," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 160(3), pages 523-541, September.
    4. G G Chen & T Åstebro, 2012. "Bound and collapse Bayesian reject inference for credit scoring," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 63(10), pages 1374-1387, October.
    5. David Hand & Niall Adams, 2000. "Defining attributes for scorecard construction in credit scoring," Journal of Applied Statistics, Taylor & Francis Journals, vol. 27(5), pages 527-540.
    6. David J Hand & Niall M Adams, 2014. "Selection bias in credit scorecard evaluation," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 65(3), pages 408-415, March.
    7. J Banasik & J Crook & L Thomas, 2003. "Sample selection bias in credit scoring models," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 54(8), pages 822-832, August.
    8. Martin, Daniel, 1977. "Early warning of bank failure : A logit regression approach," Journal of Banking & Finance, Elsevier, vol. 1(3), pages 249-276, November.
    9. Yi Peng & Yong Zhang & Gang Kou & Yong Shi, 2012. "A Multicriteria Decision Making Approach for Estimating the Number of Clusters in a Data Set," PLOS ONE, Public Library of Science, vol. 7(7), pages 1-9, July.
    10. Jones, Stewart & Johnstone, David & Wilson, Roy, 2015. "An empirical evaluation of the performance of binary classifiers in the prediction of credit ratings changes," Journal of Banking & Finance, Elsevier, vol. 56(C), pages 72-85.
    11. Bücker, Michael & van Kampen, Maarten & Krämer, Walter, 2013. "Reject inference in consumer credit scoring with nonignorable missing data," Journal of Banking & Finance, Elsevier, vol. 37(3), pages 1040-1045.
    12. Steven Finlay, 2012. "Credit Scoring, Response Modeling, and Insurance Rating," Palgrave Macmillan Books, Palgrave Macmillan, edition 0, number 978-1-137-03169-3, December.
    13. Jonathan Crook & Tony Bellotti, 2010. "Time varying and dynamic models for default risk in consumer loans," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 173(2), pages 283-305, April.
    14. Crook, Jonathan & Banasik, John, 2004. "Does reject inference really improve the performance of application scoring models?," Journal of Banking & Finance, Elsevier, vol. 28(4), pages 857-874, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rogelio A. Mancisidor & Michael Kampffmeyer & Kjersti Aas & Robert Jenssen, 2019. "Deep Generative Models for Reject Inference in Credit Scoring," Papers 1904.11376, arXiv.org, revised Sep 2021.
    2. Ha-Thu Nguyen, 2016. "Reject inference in application scorecards: evidence from France," EconomiX Working Papers 2016-10, University of Paris Nanterre, EconomiX.
    3. Ha Thu Nguyen, 2016. "Reject inference in application scorecards: evidence from France," Working Papers hal-04141601, HAL.
    4. Gero Szepannek, 2022. "An Overview on the Landscape of R Packages for Open Source Scorecard Modelling," Risks, MDPI, vol. 10(3), pages 1-33, March.
    5. Monir El Annas & Badreddine Benyacoub & Mohamed Ouzineb, 2023. "Semi-supervised adapted HMMs for P2P credit scoring systems with reject inference," Computational Statistics, Springer, vol. 38(1), pages 149-169, March.
    6. Ha-Thu Nguyen, 2015. "How is credit scoring used to predict default in China?," EconomiX Working Papers 2015-1, University of Paris Nanterre, EconomiX.
    7. Evžen Kocenda & Martin Vojtek, 2011. "Default Predictors in Retail Credit Scoring: Evidence from Czech Banking Data," Emerging Markets Finance and Trade, Taylor & Francis Journals, vol. 47(6), pages 80-98, November.
    8. Silva, Diego M.B. & Pereira, Gustavo H.A. & Magalhães, Tiago M., 2022. "A class of categorization methods for credit scoring models," European Journal of Operational Research, Elsevier, vol. 296(1), pages 323-331.
    9. Hussein A. Abdou & John Pointon, 2011. "Credit Scoring, Statistical Techniques And Evaluation Criteria: A Review Of The Literature," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 18(2-3), pages 59-88, April.
    10. Dimitris Andriosopoulos & Michalis Doumpos & Panos M. Pardalos & Constantin Zopounidis, 2019. "Computational approaches and data analytics in financial services: A literature review," Journal of the Operational Research Society, Taylor & Francis Journals, vol. 70(10), pages 1581-1599, October.
    11. Andrea Bedin & Monica Billio & Michele Costola & Loriana Pelizzon, 2019. "Credit Scoring in SME Asset-Backed Securities: An Italian Case Study," JRFM, MDPI, vol. 12(2), pages 1-28, May.
    12. Crook, Jonathan N. & Edelman, David B. & Thomas, Lyn C., 2007. "Recent developments in consumer credit risk assessment," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1447-1465, December.
    13. Dimitrios Nikolaidis & Michalis Doumpos, 2022. "Credit Scoring with Drift Adaptation Using Local Regions of Competence," SN Operations Research Forum, Springer, vol. 3(4), pages 1-28, December.
    14. Maria Rocha Sousa & João Gama & Elísio Brandão, 2013. "Introducing time-changing economics into credit scoring," FEP Working Papers 513, Universidade do Porto, Faculdade de Economia do Porto.
    15. Ha Thu Nguyen, 2015. "How is credit scoring used to predict default in China?," Working Papers hal-04133309, HAL.
    16. Mengnan Song & Jiasong Wang & Suisui Su, 2022. "Towards a Better Microcredit Decision," Papers 2209.07574, arXiv.org.
    17. Crone, Sven F. & Finlay, Steven, 2012. "Instance sampling in credit scoring: An empirical study of sample size and balancing," International Journal of Forecasting, Elsevier, vol. 28(1), pages 224-238.
    18. Kiefer, Nicholas M. & Larson, C. Erik, 2006. "Specification and Informational Issues in Credit Scoring," Working Papers 06-11, Cornell University, Center for Analytic Economics.
    19. J Banasik & J Crook, 2010. "Reject inference in survival analysis by augmentation," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 61(3), pages 473-485, March.
    20. Dong-Her Shih & Ting-Wei Wu & Po-Yuan Shih & Nai-An Lu & Ming-Hung Shih, 2022. "A Framework of Global Credit-Scoring Modeling Using Outlier Detection and Machine Learning in a P2P Lending Platform," Mathematics, MDPI, vol. 10(13), pages 1-13, June.
