Printed from https://ideas.repec.org/a/eee/intfor/v41y2025i3p920-939.html

A semi-supervised reject inference framework with hierarchical heterogeneous networks for credit scoring

Authors

  • Chen, Liao
  • Jia, Ning
  • Jiao, Zhixian
  • Zhao, Hongke
  • Cui, Runbang
  • Wang, Huimin

Abstract

Credit scoring is a widely used tool for loan assessment, i.e., deciding whether to accept or reject a loan application. Traditional research on learning for credit scoring has used only historically accepted samples and excluded rejected applicants, whose true repayment performance is unobserved; this both introduces sample selection bias and wastes data. Some methods have been proposed for inferring the outcomes of rejected samples, but they still face several open problems, especially for medium- and long-term loan applications with a higher rejection rate. In particular, the heterogeneous relationships between accepted and rejected applications have not been well studied. Moreover, the complex repayment behaviors resulting from long repayment terms may lead to poor learning performance. Thus, we propose a reject inference framework with a Semi-supervised Hierarchical Heterogeneous Network (S2HN) for credit scoring. We introduce a hierarchical heterogeneous network to reveal the complex connections between accepted and rejected applications, and we use prospective heterogeneous repayment patterns as auxiliary information through clustering and a two-layer prediction architecture. Extensive experiments on real-world data sets demonstrate the effectiveness of the proposed method.
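
The general idea of semi-supervised reject inference mentioned in the abstract can be illustrated with a small sketch. This is not the S2HN method; it is a generic self-training loop that stands in for it, using a nearest-centroid classifier as a placeholder model. The feature values, the `min_margin` threshold, and all function names are hypothetical and chosen only for illustration.

```python
from statistics import mean

def centroids(points, labels):
    """Mean feature vector per class (0 = repaid, 1 = defaulted)."""
    result = {}
    for cls in set(labels):
        members = [p for p, y in zip(points, labels) if y == cls]
        result[cls] = [mean(col) for col in zip(*members)]
    return result

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(cents, point):
    """Assign the class whose centroid is nearest."""
    return min(cents, key=lambda cls: sq_dist(cents[cls], point))

def margin(cents, point):
    """Gap between the two centroid distances; larger means more confident."""
    d = sorted(sq_dist(c, point) for c in cents.values())
    return d[1] - d[0]

def self_training(accepted_x, accepted_y, rejected_x, min_margin, rounds=5):
    """Pseudo-label confident rejected applicants, then refit (assumed scheme)."""
    x, y = list(accepted_x), list(accepted_y)
    pool = list(rejected_x)
    for _ in range(rounds):
        cents = centroids(x, y)
        confident = [p for p in pool if margin(cents, p) >= min_margin]
        if not confident:
            break
        for p in confident:
            x.append(p)
            y.append(predict(cents, p))
        pool = [p for p in pool if margin(cents, p) < min_margin]
    return centroids(x, y), pool

# Hypothetical toy data: [income score, debt score] per applicant.
accepted_x = [[5.0, 1.0], [6.0, 1.5], [5.5, 0.5], [2.0, 4.0], [1.5, 4.5]]
accepted_y = [0, 0, 0, 1, 1]          # observed repayment outcomes
rejected_x = [[1.0, 5.0], [0.5, 5.5], [6.5, 1.0], [3.5, 2.8]]  # no outcomes

model, undecided = self_training(accepted_x, accepted_y, rejected_x, min_margin=5.0)
print(model)      # class centroids after absorbing confident rejected cases
print(undecided)  # rejected applicants that stayed unlabeled
```

In this toy run, three rejected applicants lie far from the decision boundary and receive pseudo-labels, while the borderline one stays unlabeled. A threshold of this kind is one common way to limit the propagation of wrong pseudo-labels, although it does not remove sample selection bias on its own.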

Suggested Citation

  • Chen, Liao & Jia, Ning & Jiao, Zhixian & Zhao, Hongke & Cui, Runbang & Wang, Huimin, 2025. "A semi-supervised reject inference framework with hierarchical heterogeneous networks for credit scoring," International Journal of Forecasting, Elsevier, vol. 41(3), pages 920-939.
  • Handle: RePEc:eee:intfor:v:41:y:2025:i:3:p:920-939
    DOI: 10.1016/j.ijforecast.2024.07.011

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0169207024000785
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ha Thu Nguyen, 2016. "Reject inference in application scorecards: evidence from France," Working Papers hal-04141601, HAL.
    2. Kozodoi, Nikita & Lessmann, Stefan & Alamgir, Morteza & Moreira-Matias, Luis & Papakonstantinou, Konstantinos, 2025. "Fighting sampling bias: A framework for training and evaluating credit scoring models," European Journal of Operational Research, Elsevier, vol. 324(2), pages 616-628.
    3. Monir El Annas & Badreddine Benyacoub & Mohamed Ouzineb, 2023. "Semi-supervised adapted HMMs for P2P credit scoring systems with reject inference," Computational Statistics, Springer, vol. 38(1), pages 149-169, March.
    4. Mengnan Song & Jiasong Wang & Suisui Su, 2022. "Towards a Better Microcredit Decision," Papers 2209.07574, arXiv.org.
    5. Calabrese, Raffaella & Osmetti, Silvia Angela & Zanin, Luca, 2024. "Sample selection bias in non-traditional lending: A copula-based approach for imbalanced data," Socio-Economic Planning Sciences, Elsevier, vol. 95(C).
    6. Dangxing Chen & Weicheng Ye & Jiahui Ye, 2022. "Interpretable Selective Learning in Credit Risk," Papers 2209.10127, arXiv.org.
    7. Zhiyong Li & Xinyi Hu & Ke Li & Fanyin Zhou & Feng Shen, 2020. "Inferring the outcomes of rejected loans: an application of semisupervised clustering," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(2), pages 631-654, February.
    8. Crone, Sven F. & Finlay, Steven, 2012. "Instance sampling in credit scoring: An empirical study of sample size and balancing," International Journal of Forecasting, Elsevier, vol. 28(1), pages 224-238.
    9. Nadia Ayed & Khemaies Bougatef, 2024. "Performance Assessment of Logistic Regression (LR), Artificial Neural Network (ANN), Fuzzy Inference System (FIS) and Adaptive Neuro-Fuzzy System (ANFIS) in Predicting Default Probability: The Case of," Computational Economics, Springer;Society for Computational Economics, vol. 64(3), pages 1803-1835, September.
    10. J Banasik & J Crook, 2010. "Reject inference in survival analysis by augmentation," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 61(3), pages 473-485, March.
    11. Teply, Petr & Polena, Michal, 2020. "Best classification algorithms in peer-to-peer lending," The North American Journal of Economics and Finance, Elsevier, vol. 51(C).
    12. Ha-Thu Nguyen, 2016. "Reject inference in application scorecards: evidence from France," EconomiX Working Papers 2016-10, University of Paris Nanterre, EconomiX.
    13. Baesens, Bart & Smedts, Kristien, 2025. "Boosting credit risk models," The British Accounting Review, Elsevier, vol. 57(4).
    14. Juan Laborda & Seyong Ryoo, 2021. "Feature Selection in a Credit Scoring Model," Mathematics, MDPI, vol. 9(7), pages 1-22, March.
    15. Hussein A. Abdou & John Pointon, 2011. "Credit Scoring, Statistical Techniques And Evaluation Criteria: A Review Of The Literature," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 18(2-3), pages 59-88, April.
    16. Jiang, Cuiqing & Wang, Zhao & Zhao, Huimin, 2019. "A prediction-driven mixture cure model and its application in credit scoring," European Journal of Operational Research, Elsevier, vol. 277(1), pages 20-31.
    17. Crook, Jonathan N. & Edelman, David B. & Thomas, Lyn C., 2007. "Recent developments in consumer credit risk assessment," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1447-1465, December.
    18. Büşra Alma Çallı & Erman Coşkun, 2021. "A Longitudinal Systematic Review of Credit Risk Assessment and Credit Default Predictors," SAGE Open, vol. 11(4), pages 21582440211, November.
    19. Huei-Wen Teng & Michael Lee, 2019. "Estimation Procedures of Using Five Alternative Machine Learning Methods for Predicting Credit Card Default," Review of Pacific Basin Financial Markets and Policies (RPBFMP), World Scientific Publishing Co. Pte. Ltd., vol. 22(03), pages 1-27, September.
    20. Liu, Wanan & Fan, Hong & Xia, Meng, 2023. "Tree-based heterogeneous cascade ensemble model for credit scoring," International Journal of Forecasting, Elsevier, vol. 39(4), pages 1593-1614.

