
Machine Learning for Credit Scoring: Improving Logistic Regression with Non Linear Decision Tree Effects

Authors
  • Elena Ivona Dumitrescu

    (EconomiX - UPN - Université Paris Nanterre - CNRS - Centre National de la Recherche Scientifique)

  • Sullivan Hué

    (LEO - Laboratoire d'Économie d'Orleans [2021-2022] - UO - Université d'Orléans - UT - Université de Tours)

  • Christophe Hurlin

    (LEO - Laboratoire d'Économie d'Orleans [2021-2022] - UO - Université d'Orléans - UT - Université de Tours)

  • Sessi Tokpavi

    (LEO - Laboratoire d'Économie d'Orleans [2021-2022] - UO - Université d'Orléans - UT - Université de Tours)

Abstract

In the context of credit scoring, ensemble methods based on decision trees, such as the random forest method, provide better classification performance than standard logistic regression models. However, logistic regression remains the benchmark in the credit risk industry mainly because the lack of interpretability of ensemble methods is incompatible with the requirements of financial regulators. In this paper, we propose a high-performance and interpretable credit scoring method called penalised logistic tree regression (PLTR), which uses information from decision trees to improve the performance of logistic regression. Formally, rules extracted from various short-depth decision trees built with original predictive variables are used as predictors in a penalised logistic regression model. PLTR allows us to capture non-linear effects that can arise in credit scoring data while preserving the intrinsic interpretability of the logistic regression model. Monte Carlo simulations and empirical applications using four real credit default datasets show that PLTR predicts credit risk significantly more accurately than logistic regression and compares competitively to the random forest method.
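
The idea described in the abstract (tree-derived rules fed into a penalised logistic regression) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the trees are reduced to one-split stumps with one stump per variable, the penalty is a plain L1 penalty fitted by proximal gradient descent, the data are synthetic and noise-free, and all function names (`best_threshold`, `rule_features`, `fit_l1_logistic`) are hypothetical.

```python
import math
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_threshold(xs, ys):
    """Depth-1 tree on one variable: the split minimising weighted Gini impurity."""
    best_t, best_score = None, float("inf")
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

def rule_features(X, thresholds):
    """Turn each stump into a binary rule indicator: 1 if x_j <= t_j, else 0."""
    return [[1.0 if row[j] <= t else 0.0 for j, t in enumerate(thresholds)]
            for row in X]

def soft_threshold(v, k):
    """Proximal operator of the L1 penalty."""
    return math.copysign(max(abs(v) - k, 0.0), v)

def fit_l1_logistic(Z, y, lam=0.01, lr=0.5, n_iter=300):
    """L1-penalised logistic regression via proximal gradient descent.
    The intercept b is left unpenalised."""
    n, p = len(Z), len(Z[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for z, yi in zip(Z, y):
            eta = b + sum(wj * zj for wj, zj in zip(w, z))
            err = 1.0 / (1.0 + math.exp(-eta)) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * z[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

# Synthetic data (assumption): default occurs exactly when x0 exceeds 0.6,
# a threshold effect that a linear term in x0 represents poorly; x1 is noise.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(300)]
y = [1 if row[0] > 0.6 else 0 for row in X]

# Step 1: extract one rule per variable from a one-split tree.
thresholds = [best_threshold([row[j] for row in X], y) for j in range(2)]
# Step 2: use the rule indicators as predictors in a penalised logistic model.
Z = rule_features(X, thresholds)
w, b = fit_l1_logistic(Z, y)

pred = [1 if b + sum(wj * zj for wj, zj in zip(w, z)) > 0 else 0 for z in Z]
accuracy = sum(p == yi for p, yi in zip(pred, y)) / len(y)
print(thresholds, w, b, accuracy)
```

In this sketch each fitted coefficient attaches to a readable rule of the form "x_j <= t_j", which is one way to read the interpretability claim in the abstract. A fuller version would presumably also use deeper trees to capture interactions between variables, but the exact tree depths and penalty used by the authors are not stated on this page.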

Suggested Citation

  • Elena Ivona Dumitrescu & Sullivan Hué & Christophe Hurlin & Sessi Tokpavi, 2022. "Machine Learning for Credit Scoring: Improving Logistic Regression with Non Linear Decision Tree Effects," Post-Print hal-03331114, HAL.
  • Handle: RePEc:hal:journl:hal-03331114
    DOI: 10.1016/j.ejor.2021.06.053
    Note: View the original document on HAL open archive server: https://hal.science/hal-03331114v1

    Download full text from publisher

    File URL: https://hal.science/hal-03331114v1/document
    Download Restriction: no

    File URL: https://libkey.io/10.1016/j.ejor.2021.06.053?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
