Interpretability via Random Forests

In: Interpretability for Industry 4.0 : Statistical and Machine Learning Approaches

Authors

  • Clément Bénard (Digital Sciences & Technologies, Safran Tech; CNRS, LPSM, Sorbonne Université)
  • Sébastien Da Veiga (Digital Sciences & Technologies, Safran Tech)
  • Erwan Scornet (Institut Polytechnique de Paris, CMAP, École Polytechnique)

Abstract

Although there is no consensus on a precise definition of interpretability, three requirements can be identified: simplicity, stability, and accuracy. Existing interpretable methods rarely satisfy all three at once. The structure and stability of random forests make them good candidates for improving the performance of interpretable algorithms. The first part of this chapter focuses on rule learning models, which are simple and highly predictive but very often unstable with respect to small data perturbations. A new algorithm called SIRUS, which extracts a compact rule ensemble from a random forest, considerably improves stability over state-of-the-art competitors while preserving simplicity and accuracy. The second part of this chapter is dedicated to post-hoc methods, in particular variable importance measures for random forests. Viewed from a sensitivity analysis perspective, an asymptotic analysis of Breiman’s MDA (Mean Decrease Accuracy) shows that this measure is strongly biased. The Sobol-MDA algorithm is introduced to fix the flaws of the MDA by replacing permutations with projections. An extension to Shapley effects, an efficient importance measure when input variables are dependent, is then proposed with the SHAFF algorithm.
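
The permutation idea behind the MDA can be illustrated with a minimal sketch. This is not the chapter's algorithm, nor Breiman's exact out-of-bag procedure: the function names, the toy data, and the `predict` stand-in (used in place of a fitted random forest) are all assumptions made for illustration. The sketch measures how much the mean squared error grows when one input column is shuffled.

```python
import random
from statistics import mean


def mse(predict, X, y):
    """Mean squared error of `predict` over the rows of X."""
    return mean((predict(row) - target) ** 2 for row, target in zip(X, y))


def permutation_importance(predict, X, y, j, n_repeats=20, seed=0):
    """Average increase in MSE when column j of X is randomly shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    increases = []
    for _ in range(n_repeats):
        column = [row[j] for row in X]
        rng.shuffle(column)  # break the link between column j and the output
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        increases.append(mse(predict, X_perm, y) - baseline)
    return mean(increases)


# Toy data (assumption): the output depends on x0 only, x1 is irrelevant.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [2.0 * row[0] for row in X]


def predict(row):
    """Hypothetical stand-in for a fitted model such as a random forest."""
    return 2.0 * row[0]


importances = [permutation_importance(predict, X, y, j) for j in range(2)]
print(importances)
```

In this toy setting the two inputs are independent, so shuffling is harmless. When inputs are dependent, shuffling one column produces input combinations far from the data distribution, which is the kind of situation where the abstract reports that the MDA is biased and where the Sobol-MDA replaces permutations with projections.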

Suggested Citation

  • Clément Bénard & Sébastien Da Veiga & Erwan Scornet, 2022. "Interpretability via Random Forests," Springer Books, in: Antonio Lepore & Biagio Palumbo & Jean-Michel Poggi (eds.), Interpretability for Industry 4.0 : Statistical and Machine Learning Approaches, chapter 0, pages 37-84, Springer.
  • Handle: RePEc:spr:sprchp:978-3-031-12402-0_3
    DOI: 10.1007/978-3-031-12402-0_3