IDEAS home Printed from https://ideas.repec.org/a/spr/digfin/v7y2025i4d10.1007_s42521-025-00135-6.html

Partial dependence analysis of financial ratios in predicting company defaults: random forest vs XGBoost models

Author

Listed:
  • Monia Antar

    (American University in the Emirates
    University of Tunis)

  • Tahar Tayachi

    (American University in the Emirates
    University of Monastir)

Abstract

In this paper, we investigate the use of machine learning models to predict credit defaults from financial ratios. We compare the performance and interpretability of two ensemble learning algorithms: Random Forest and XGBoost. To improve the models' capability to detect defaults, we address the inherent class imbalance of default prediction tasks by balancing the dataset with the ROSE (Random Over-Sampling Examples) technique. Both models are trained on imbalanced and balanced datasets. We use Accuracy, Sensitivity, Specificity, F1 Score, and AUC (Area Under the ROC Curve) to evaluate the models' performance. We also validate model performance using Rank Graduation Accuracy (RGA) to assess ranking consistency, which reveals superior predictive power on imbalanced data (RGA = 0.991–0.993) versus balanced distributions (RGA = 0.959–0.965). Contrary to oversampling orthodoxy, ROSE balancing degraded performance, in line with theoretical critiques of synthetic data in mature classifiers. We also interpret the models by calculating feature importance using Shapley-Lorenz values. Partial Dependence Plots (PDPs) help to visualize how key financial ratios affect the predicted probability of default. Results show non-linear relationships between default risk and key financial ratios such as Return on Assets (R6) and Debt to Equity Ratio (R8). The key features identified are similar for Random Forest and XGBoost, though the interpretation of feature importance differs slightly. To enhance the robustness and credibility of our feature effect analysis, we also produce ALE (Accumulated Local Effects) plots, as they provide a more robust framework that accounts for feature interactions. This study advances credit default prediction in Tunisia's banking sector by enhancing interpretability through Accumulated Local Effects analysis alongside Partial Dependence Plots, providing robust insights into feature effects, particularly for key financial ratios.
Results offer insights about important ratio thresholds and their impact on default probability prediction, such as the sharp drop in default risk when R6 becomes positive. These advancements provide regulators and financial institutions with more reliable tools for credit risk assessment in Tunisia's economic context, bridging the gap between sophisticated machine learning techniques and practical, interpretable financial decision-making.
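
The idea behind a partial dependence curve can be illustrated with a minimal Python sketch. It is not the authors' code or data: the synthetic firms, the column roles (column 0 as a stand-in for Return on Assets, column 1 as a stand-in for Debt to Equity), the assumed data-generating process, and the helper names make_synthetic_firms and partial_dependence_1d are all hypothetical, and scikit-learn's RandomForestClassifier is used only as a convenient example model. The sketch fixes one ratio at each grid value for every firm and averages the predicted default probability.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def make_synthetic_firms(n=2000, seed=0):
    """Generate hypothetical firms; NOT the data used in the paper."""
    rng = np.random.default_rng(seed)
    roa = rng.normal(0.03, 0.08, n)   # stand-in for Return on Assets
    dte = rng.lognormal(0.0, 0.6, n)  # stand-in for Debt to Equity
    # Assumed process: default risk is much higher when ROA is negative.
    logit = -3.0 + 2.5 * (roa < 0) + 0.6 * (dte - 1.0)
    prob = 1.0 / (1.0 + np.exp(-logit))
    y = rng.binomial(1, prob)
    return np.column_stack([roa, dte]), y


def partial_dependence_1d(model, X, feature, grid):
    """Average predicted default probability with one feature fixed."""
    values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v  # set the ratio to v for every firm
        values.append(model.predict_proba(X_mod)[:, 1].mean())
    return np.array(values)


X, y = make_synthetic_firms()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Trained on the imbalanced data as-is (no resampling).
model = RandomForestClassifier(
    n_estimators=200, min_samples_leaf=20, random_state=0
)
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Partial dependence of predicted default probability on the ROA stand-in.
grid = np.linspace(-0.15, 0.15, 13)
pd_roa = partial_dependence_1d(model, X_train, feature=0, grid=grid)

print(f"default rate: {y.mean():.3f}  test AUC: {auc:.3f}")
for v, p in zip(grid, pd_roa):
    print(f"ROA = {v:+.3f}  mean predicted P(default) = {p:.3f}")
```

Because the synthetic process builds in a jump at zero, the printed curve should fall as the ROA stand-in moves from negative to positive values; on real data the shape would have to be read from the fitted model. Accumulated Local Effects differ in that they average local prediction differences within narrow intervals of the feature rather than overwriting the feature for all observations, which makes them less sensitive to correlated ratios.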

Suggested Citation

  • Monia Antar & Tahar Tayachi, 2025. "Partial dependence analysis of financial ratios in predicting company defaults: random forest vs XGBoost models," Digital Finance, Springer, vol. 7(4), pages 997-1012, December.
  • Handle: RePEc:spr:digfin:v:7:y:2025:i:4:d:10.1007_s42521-025-00135-6
    DOI: 10.1007/s42521-025-00135-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s42521-025-00135-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s42521-025-00135-6?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    More about this item

    JEL classification:

    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • G32 - Financial Economics - - Corporate Finance and Governance - - - Financing Policy; Financial Risk and Risk Management; Capital and Ownership Structure; Value of Firms; Goodwill
    • G33 - Financial Economics - - Corporate Finance and Governance - - - Bankruptcy; Liquidation
    • M41 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Accounting - - - Accounting
