
Combination of unsupervised discretization methods for credit risk

Authors

  • José G Fuentes Cabrera
  • Hugo A Pérez Vicente
  • Sebastián Maldonado
  • Jonás Velasco

Abstract

Creating robust and explainable statistical learning models is essential in credit risk management. For this purpose, equal-width or equal-frequency discretization is the de facto choice when building predictive models. These methods have limitations: when the discretization procedure is constrained, the underlying patterns are lost. This study introduces an approach that combines traditional discretization techniques with clustering-based discretization, specifically k-means and Gaussian mixture models. The study proposes two combinations: Discrete Competitive Combination (DCC) and Discrete Exhaustive Combination (DEC). DCC selects, for each feature, the discretization method that performs best on that feature, whereas DEC includes every discretization method so that each complements the information not captured by the others. The proposed combinations were tested on 11 different credit risk datasets by fitting a logistic regression model using the weight of evidence transformation over the training partition and contrasting the results over the validation partition. The experimental findings showed that both combinations outperform the individual methods to a similar degree for the logistic regression without compromising computational efficiency. More importantly, the proposed method is a feasible and competitive alternative to conventional methods without reducing explainability.
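
The two combinations described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several choices here are assumptions made for illustration only: the function names, the use of information value (IV) as the per-feature criterion for the competitive combination (the abstract does not say which criterion is used), the additive smoothing of 0.5 in the weight of evidence, and the toy data. Gaussian mixture discretization is omitted for brevity, and k-means is a simple one-dimensional Lloyd iteration.

```python
import math
from collections import defaultdict

def equal_width_edges(values, n_bins):
    """Interior cut points splitting [min, max] into n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]

def equal_frequency_edges(values, n_bins):
    """Interior cut points taken at evenly spaced ranks of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    return sorted({ordered[(n * i) // n_bins] for i in range(1, n_bins)})

def kmeans_edges(values, n_bins, n_iter=50):
    """1-D Lloyd's k-means; cut points are midpoints between adjacent centroids."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: one centroid at the middle rank of each equal-size slice.
    centers = [ordered[(n * (2 * i + 1)) // (2 * n_bins)] for i in range(n_bins)]
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for v in ordered:
            nearest = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    centers = sorted(set(centers))
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]

def assign_bins(values, edges):
    """Bin index = number of cut points less than or equal to the value."""
    return [sum(v >= e for e in edges) for v in values]

def woe_and_iv(bins, target, smoothing=0.5):
    """Per-bin weight of evidence and total information value.

    target: 1 = default ("bad"), 0 = non-default ("good").
    The smoothing constant is an assumption that avoids log(0) on empty cells.
    """
    good, bad = defaultdict(float), defaultdict(float)
    for b, y in zip(bins, target):
        (bad if y == 1 else good)[b] += 1
    labels = sorted(set(bins))
    total_good = sum(good[b] + smoothing for b in labels)
    total_bad = sum(bad[b] + smoothing for b in labels)
    woe, iv = {}, 0.0
    for b in labels:
        p_good = (good[b] + smoothing) / total_good
        p_bad = (bad[b] + smoothing) / total_bad
        woe[b] = math.log(p_good / p_bad)
        iv += (p_good - p_bad) * woe[b]
    return woe, iv

METHODS = {
    "equal_width": equal_width_edges,
    "equal_frequency": equal_frequency_edges,
    "kmeans": kmeans_edges,
}

def discretize_all(values, target, n_bins=3):
    """WoE-encode one feature with every method; returns {method: (column, iv)}."""
    out = {}
    for name, make_edges in METHODS.items():
        bins = assign_bins(values, make_edges(values, n_bins))
        woe, iv = woe_and_iv(bins, target)
        out[name] = ([woe[b] for b in bins], iv)
    return out

def competitive(values, target, n_bins=3):
    """DCC-style: keep only the method with the highest IV (assumed criterion)."""
    results = discretize_all(values, target, n_bins)
    best = max(results, key=lambda m: results[m][1])
    return best, results[best][0]

def exhaustive(values, target, n_bins=3):
    """DEC-style: keep the WoE column produced by every method."""
    return {m: col for m, (col, _) in discretize_all(values, target, n_bins).items()}

# Toy feature and default flag (made-up data, for illustration only).
values = [1, 2, 3, 4, 5, 6, 20, 21, 22, 50, 51, 52]
target = [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1]

best_method, dcc_column = competitive(values, target)
dec_columns = exhaustive(values, target)
print(best_method, sorted(dec_columns))
```

In this sketch the competitive variant yields one encoded column per original feature, while the exhaustive variant yields one column per feature and method; either set of columns could then be passed to a logistic regression fitted on the training partition.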

Suggested Citation

  • José G Fuentes Cabrera & Hugo A Pérez Vicente & Sebastián Maldonado & Jonás Velasco, 2023. "Combination of unsupervised discretization methods for credit risk," PLOS ONE, Public Library of Science, vol. 18(11), pages 1-18, November.
  • Handle: RePEc:plo:pone00:0289130
    DOI: 10.1371/journal.pone.0289130

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0289130
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0289130&type=printable
    Download Restriction: no



    Citations


    Cited by:

    1. Wenjun Ke & Yulin Liu & Jiahao Wang & Zhi Fang & Zangbo Chi & Yikai Guo & Rui Wang & Peng Wang, 2024. "DecentralDC: Assessing data contribution under decentralized sharing and exchange blockchain," PLOS ONE, Public Library of Science, vol. 19(10), pages 1-48, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bavaresco, Rodrigo Simon & Nesi, Luan Carlos & Victória Barbosa, Jorge Luis & Antunes, Rodolfo Stoffel & da Rosa Righi, Rodrigo & da Costa, Cristiano André & Vanzin, Mariangela & Dornelles, Daniel & J, 2023. "Machine learning-based automation of accounting services: An exploratory case study," International Journal of Accounting Information Systems, Elsevier, vol. 49(C).
    2. Salvatore Tedesco & Martina Andrulli & Markus Åkerlund Larsson & Daniel Kelly & Antti Alamäki & Suzanne Timmons & John Barton & Joan Condell & Brendan O’Flynn & Anna Nordström, 2021. "Comparison of Machine Learning Techniques for Mortality Prediction in a Prospective Cohort of Older Adults," IJERPH, MDPI, vol. 18(23), pages 1-18, December.
    3. Natalia Nehrebecka, 2021. "Internal Credit Risk Models and Digital Transformation: What to Prepare for? An Application to Poland," European Research Studies Journal, European Research Studies Journal, vol. 0(Special 2), pages 719-736.
    4. Ionut Anica-Popa & Liana Anica-Popa & Cristina Radulescu & Marinela Vrincianu, 2021. "The Integration of Artificial Intelligence in Retail: Benefits, Challenges and a Dedicated Conceptual Framework," The AMFITEATRU ECONOMIC journal, Academy of Economic Studies - Bucharest, Romania, vol. 23(56), pages 120-120, February.
    5. Neubert, Mitchell J. & Montañez, George D., 2020. "Virtue as a framework for the design and use of artificial intelligence," Business Horizons, Elsevier, vol. 63(2), pages 195-204.
    6. Robertson, Jeandri & Botha, Elsamari & Oosthuizen, Kim & Montecchi, Matteo, 2025. "Managing change when integrating artificial intelligence (AI) into the retail value chain: The AI implementation compass," Journal of Business Research, Elsevier, vol. 189(C).
    7. Alina Köchling & Marius Claus Wehner, 2020. "Discriminated by an algorithm: a systematic review of discrimination and fairness by algorithmic decision-making in the context of HR recruitment and HR development," Business Research, Springer;German Academic Association for Business Research, vol. 13(3), pages 795-848, November.
    8. Kamoonpuri, Sana Zehra & Sengar, Anita, 2023. "Hi, May AI help you? An analysis of the barriers impeding the implementation and use of artificial intelligence-enabled virtual assistants in retail," Journal of Retailing and Consumer Services, Elsevier, vol. 72(C).
    9. Ming‐Lang Tseng & Hien Minh Ha & Thi Phuong Thuy Tran & Tat‐Dat Bui & Chih‐Cheng Chen & Chun‐Wei Lin, 2022. "Building a data‐driven circular supply chain hierarchical structure: Resource recovery implementation drives circular business strategy," Business Strategy and the Environment, Wiley Blackwell, vol. 31(5), pages 2082-2106, July.
    10. N Salet & A Gökdemir & J Preijde & C H van Heck & F Eijkenaar, 2024. "Using machine learning to predict acute myocardial infarction and ischemic heart disease in primary care cardiovascular patients," PLOS ONE, Public Library of Science, vol. 19(7), pages 1-17, July.
    11. Ying Wang & Zhicheng Du & Wayne R. Lawrence & Yun Huang & Yu Deng & Yuantao Hao, 2019. "Predicting Hepatitis B Virus Infection Based on Health Examination Data of Community Population," IJERPH, MDPI, vol. 16(23), pages 1-13, December.
    12. Canhoto, Ana Isabel & Clear, Fintan, 2020. "Artificial intelligence and machine learning as business tools: A framework for diagnosing value destruction potential," Business Horizons, Elsevier, vol. 63(2), pages 183-193.
    13. Anderson, Brian S., 2022. "What executives get wrong about statistics: Moving from statistical significance to effect sizes and practical impact," Business Horizons, Elsevier, vol. 65(3), pages 379-388.
    14. Lukasz Prorokowski, 2022. "New definition of default," Bank i Kredyt, Narodowy Bank Polski, vol. 53(5), pages 523-564.
    15. Shelda Sajeev & Stephanie Champion & Alline Beleigoli & Derek Chew & Richard L. Reed & Dianna J. Magliano & Jonathan E. Shaw & Roger L. Milne & Sarah Appleton & Tiffany K. Gill & Anthony Maeder, 2021. "Predicting Australian Adults at High Risk of Cardiovascular Disease Mortality Using Standard Risk Factors and Machine Learning," IJERPH, MDPI, vol. 18(6), pages 1-14, March.
    16. Feng, Cai (Mitsu) & Botha, Elsamari & Pitt, Leyland, 2024. "From HAL to GenAI: Optimizing chatbot impacts with CARE," Business Horizons, Elsevier, vol. 67(5), pages 537-548.
    17. Syed, Tahir Abbas & Aslam, Haris & Bhatti, Zeeshan Ahmed & Mehmood, Fahad & Pahuja, Aseem, 2024. "Dynamic pricing for perishable goods: A data-driven digital transformation approach," International Journal of Production Economics, Elsevier, vol. 277(C).
    18. Alisha Lakra & Shubhkirti Gupta & Ravi Ranjan & Sushanta Tripathy & Deepak Singhal, 2022. "The Significance of Machine Learning in the Manufacturing Sector: An ISM Approach," Logistics, MDPI, vol. 6(4), pages 1-15, October.
    19. Woo Suk Hong & Adrian Daniel Haimovich & R Andrew Taylor, 2018. "Predicting hospital admission at emergency department triage using machine learning," PLOS ONE, Public Library of Science, vol. 13(7), pages 1-13, July.
    20. Rijwan Khan, 2023. "Deep Learning System and It’s Automatic Testing: An Approach," Annals of Data Science, Springer, vol. 10(4), pages 1019-1033, August.
