Printed from https://ideas.repec.org/a/eee/soceps/v103y2026ics0038012125002150.html

Prediction of the gender inequality index based on data-driven interpretable ensemble learning methods

Author

Listed:
  • Özdemir, Mehmet Hakan
  • Aylak, Batin Latif
  • Cakiroglu, Celal
  • Bağcı, Mahmut

Abstract

Gender inequality is widely recognized as a major obstacle to human development and is evident in social, political, economic, and cultural life. Identifying and quantifying the factors that contribute to it is therefore crucial for societal progress. The gender inequality index (GII) was introduced in the 2010 Human Development Report to quantify and compare gender inequality across countries. The GII is calculated from multiple indicators through complex analytical calculations. This study uses these indicators as input features to predict the GII with XGBoost, CatBoost, Extra Trees, LightGBM, Ridge, and Lasso regression models. The regressors are trained to predict the GII as a function of seven features: maternal mortality ratio, adolescent birth rate, share of seats in parliament, female population with at least some secondary education, male population with at least some secondary education, female labour force participation rate, and male labour force participation rate. The XGBoost, CatBoost, Extra Trees, and LightGBM predictors achieve R² scores greater than 0.98, while the Ridge and Lasso regressors achieve R² scores below 0.90. The CatBoost model attains the highest average accuracy, while the XGBoost model is the fastest computationally. Furthermore, the Shapley additive explanations (SHAP) methodology is used to determine the impact of each input feature on the model predictions, and this information allows for more precise calculation of the GII. The proposed machine learning procedure thus offers both simplicity and flexibility in GII prediction and supports more effective use of the GII.
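For readers unfamiliar with the analytical calculation that the regressors are meant to approximate, the following is a minimal sketch of the GII aggregation. It is not taken from the paper: it follows the procedure in the UNDP Human Development Report technical notes as best understood here, and the function name `gii`, the argument names, the bounds on the maternal mortality ratio and on the parliamentary share, and the derivation of the male share of seats as one minus the female share are all assumptions made for illustration.

```python
from math import sqrt


def gii(mmr, abr, pr_f, se_f, se_m, lfpr_f, lfpr_m):
    """Hypothetical sketch of the GII aggregation (assumed from UNDP notes).

    mmr    : maternal deaths per 100,000 live births
    abr    : births per 1,000 women aged 15-19 (must be > 0)
    pr_f   : female share of parliamentary seats, as a proportion
    se_f/m : female/male population with at least some secondary
             education, as proportions (must be > 0)
    lfpr_f/m : female/male labour force participation, as proportions
    """
    # Assumed treatment of extreme values: bound MMR to [10, 1000]
    # and keep the female parliamentary share at or above 0.1%.
    mmr = min(max(mmr, 10.0), 1000.0)
    pr_f = max(pr_f, 0.001)
    pr_m = 1.0 - pr_f  # assumption: male share is the complement

    # Dimension indices for each gender.
    health_f = sqrt((10.0 / mmr) * (1.0 / abr))
    emp_f = sqrt(pr_f * se_f)
    emp_m = sqrt(pr_m * se_m)

    # Geometric mean across dimensions within each gender
    # (the male health index is fixed at 1).
    g_f = (health_f * emp_f * lfpr_f) ** (1.0 / 3.0)
    g_m = (1.0 * emp_m * lfpr_m) ** (1.0 / 3.0)

    # Harmonic mean across genders penalizes gaps between them.
    harm = 1.0 / ((1.0 / g_f + 1.0 / g_m) / 2.0)

    # Reference value: geometric mean of the gender-averaged dimensions.
    health_bar = (health_f + 1.0) / 2.0
    emp_bar = (emp_f + emp_m) / 2.0
    lfpr_bar = (lfpr_f + lfpr_m) / 2.0
    g_bar = (health_bar * emp_bar * lfpr_bar) ** (1.0 / 3.0)

    return 1.0 - harm / g_bar


# Illustrative, made-up indicator values (not data from the paper).
print(gii(mmr=200, abr=60, pr_f=0.2, se_f=0.4, se_m=0.6,
          lfpr_f=0.5, lfpr_m=0.8))
```

Under these assumptions the index is 0 when women and men fare equally in every dimension and grows toward 1 as the gaps widen. The nested geometric and harmonic means make the mapping from indicators to GII nonlinear, which is consistent with the abstract's observation that tree-based ensembles fit it better than Ridge and Lasso.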

Suggested Citation

  • Özdemir, Mehmet Hakan & Aylak, Batin Latif & Cakiroglu, Celal & Bağcı, Mahmut, 2026. "Prediction of the gender inequality index based on data-driven interpretable ensemble learning methods," Socio-Economic Planning Sciences, Elsevier, vol. 103(C).
  • Handle: RePEc:eee:soceps:v:103:y:2026:i:c:s0038012125002150
    DOI: 10.1016/j.seps.2025.102366

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0038012125002150
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.seps.2025.102366?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

