Statistical Spatial Interpretable Machine Learning in R Using Tree Ensembles and SHAP Values

Authors

Listed:
  • Mehmet Güney Celbiş

    (LAET - Laboratoire Aménagement Économie Transports - UL2 - Université Lumière - Lyon 2 - ENTPE - École Nationale des Travaux Publics de l'État - CNRS - Centre National de la Recherche Scientifique)

  • Louafi Bouzouina

    (LAET - Laboratoire Aménagement Économie Transports - UL2 - Université Lumière - Lyon 2 - ENTPE - École Nationale des Travaux Publics de l'État - CNRS - Centre National de la Recherche Scientifique)

Abstract

This handbook chapter presents, discusses, and explores the use of statistical machine learning algorithms and interpretable machine learning tools in spatial analysis. It is aimed mainly at researchers and practitioners in urban and regional spatial analysis, and in regional science more generally. Using a relatively simple dataset that can be downloaded as part of an R package, the chapter applies a series of tree-based machine learning models, with XGBoost as the primary one, and analyzes the results using SHAP values. Spatial features, spatial cross-validation, and spatial dependence are focal topics. The chapter investigates the use of coordinates and spatially lagged features, and their consequences for predictions, taking into account the potential data leakage caused by the spatial proximity of data instances in the calibration and validation sets. It demonstrates the advantages of these techniques for spatial analysis while highlighting the possible drawbacks of internalizing spatial information into machine learning models. Models predicting urban noise levels serve as the application throughout.

Suggested Citation

  • Mehmet Güney Celbiş & Louafi Bouzouina, 2025. "Statistical Spatial Interpretable Machine Learning in R Using Tree Ensembles and SHAP Values," Working Papers hal-05302147, HAL.
  • Handle: RePEc:hal:wpaper:hal-05302147
    Note: View the original document on HAL open archive server: https://hal.science/hal-05302147v1

    Download full text from publisher

    File URL: https://hal.science/hal-05302147v1/document
    Download Restriction: no