
Explaining Machine Learning by Bootstrapping Partial Dependence Functions and Shapley Values

Abstract

Machine learning and artificial intelligence methods are often referred to as “black boxes” when compared with traditional regression-based approaches. However, both traditional and machine learning methods are concerned with modeling the joint distribution between endogenous (target) and exogenous (input) variables. Where linear models describe the fitted relationship between the target and input variables via the slope of that relationship (coefficient estimates), the same fitted relationship can be described rigorously for any machine learning model by first-differencing the partial dependence functions. Bootstrapping these first-differenced functionals provides standard errors and confidence intervals for the estimated relationships. We show that this approach replicates the point estimates of OLS coefficients and demonstrate how this generalizes to marginal relationships in machine learning and artificial intelligence models. We further discuss the relationship of partial dependence functions to Shapley value decompositions and explore how they can be used to further explain model outputs.
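The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (fit_ols, partial_dependence, pd_slope, bootstrap_pd_slope), the simulated data, the evaluation grid, and the choice of a pairs (row-resampling) bootstrap are all assumptions made for illustration. Under those assumptions, the sketch computes a partial dependence function, first-differences it to obtain a slope, and bootstraps that slope to get a standard error and a percentile confidence interval. With an OLS fit as the model, the first-differenced partial dependence reproduces the OLS coefficient.

```python
import numpy as np

def fit_ols(X, y):
    """Fit OLS with an intercept and return a prediction function."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ beta

def partial_dependence(predict, X, j, grid):
    """Average prediction when column j is set to each grid value in turn."""
    values = []
    for v in grid:
        Xv = X.copy()
        Xv[:, j] = v
        values.append(predict(Xv).mean())
    return np.array(values)

def pd_slope(predict, X, j, grid):
    """Mean first difference of the partial dependence, per unit of x_j."""
    pd_vals = partial_dependence(predict, X, j, grid)
    return np.mean(np.diff(pd_vals) / np.diff(grid))

def bootstrap_pd_slope(fit, X, y, j, grid, n_boot=200, seed=0):
    """Pairs bootstrap (an assumed scheme): resample rows, refit, recompute."""
    rng = np.random.default_rng(seed)
    n = len(y)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        predict = fit(X[idx], y[idx])
        slopes[b] = pd_slope(predict, X[idx], j, grid)
    return slopes

# Simulated data (hypothetical): y depends linearly on two inputs.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)
grid = np.linspace(-1.0, 1.0, 11)

predict = fit_ols(X, y)
slope = pd_slope(predict, X, 0, grid)         # matches the OLS coefficient on x_0
boot = bootstrap_pd_slope(fit_ols, X, y, 0, grid)
se = boot.std(ddof=1)                         # bootstrap standard error
ci = np.percentile(boot, [2.5, 97.5])         # percentile confidence interval
print(slope, se, ci)
```

Because fit_ols only needs to return a prediction function, any other fitting routine with the same signature (for example, one wrapping a tree ensemble or a neural network) could be passed to bootstrap_pd_slope instead. In that case the first differences would generally vary across the grid rather than collapse to a single coefficient.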

Suggested Citation

  • Thomas R. Cook & Greg Gupton & Zach Modig & Nathan M. Palmer, 2021. "Explaining Machine Learning by Bootstrapping Partial Dependence Functions and Shapley Values," Research Working Paper RWP 21-12, Federal Reserve Bank of Kansas City.
  • Handle: RePEc:fip:fedkrw:93596
    DOI: 10.18651/RWP2021-12

    Download full text from publisher

    File URL: https://www.kansascityfed.org/documents/8518/rwp21-12cookguptonmodigpalmer.pdf
    File Function: Full text
    Download Restriction: no


    References listed on IDEAS

    1. Limsombunchai, Visit, 2004. "House Price Prediction: Hedonic Price Model vs. Artificial Neural Network," 2004 Conference, June 25-26, 2004, Blenheim, New Zealand 97781, New Zealand Agricultural and Resource Economics Society.
    2. Athey, Susan & Imbens, Guido W., 2019. "Machine Learning Methods Economists Should Know About," Research Papers 3776, Stanford University, Graduate School of Business.
    3. Marianne Bertrand & Sendhil Mullainathan, 2004. "Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination," American Economic Review, American Economic Association, vol. 94(4), pages 991-1013, September.
    4. Daniel P. McMillen & Christian L. Redfearn, 2010. "Estimation And Hypothesis Testing For Nonparametric Hedonic House Price Functions," Journal of Regional Science, Wiley Blackwell, vol. 50(3), pages 712-733, August.
    5. Daniel W. Apley & Jingyu Zhu, 2020. "Visualizing the effects of predictor variables in black box supervised learning models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(4), pages 1059-1086, September.
    6. Michael J. Hanmer & Kerem Ozan Kalkan, 2013. "Behind the Curve: Clarifying the Best Approach to Calculating Predicted Probabilities and Marginal Effects from Limited Dependent Variable Models," American Journal of Political Science, John Wiley & Sons, vol. 57(1), pages 263-277, January.
    7. Susan Athey & Guido W. Imbens, 2019. "Machine Learning Methods That Economists Should Know About," Annual Review of Economics, Annual Reviews, vol. 11(1), pages 685-725, August.
    8. W.J. McCluskey & M. McCord & P.T. Davis & M. Haran & D. McIlhatton, 2013. "Prediction accuracy in mass appraisal: a comparison of modern approaches," Journal of Property Research, Taylor & Francis Journals, vol. 30(4), pages 239-265, December.
    9. Joachim Zietz & Emily Zietz & G. Sirmans, 2008. "Determinants of House Prices: A Quantile Regression Approach," The Journal of Real Estate Finance and Economics, Springer, vol. 37(4), pages 317-333, November.
    10. Qingyuan Zhao & Trevor Hastie, 2021. "Causal Interpretations of Black-Box Models," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 39(1), pages 272-281, January.

    Citations


    Cited by:

    1. Thomas R. Cook & Nathan M. Palmer, 2023. "Understanding Models and Model Bias with Gaussian Processes," Research Working Paper RWP 23-07, Federal Reserve Bank of Kansas City.
    2. repec:fip:fedkrr:96511 is not listed on IDEAS

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jose Torres-Pruñonosa & Pablo García-Estévez & Josep Maria Raya & Camilo Prado-Román, 2022. "How on Earth Did Spanish Banking Sell the Housing Stock?," SAGE Open, vol. 12(1), pages 21582440221, March.
    2. Julien Chevallier & Dominique Guégan & Stéphane Goutte, 2021. "Is It Possible to Forecast the Price of Bitcoin?," Forecasting, MDPI, vol. 3(2), pages 1-44, May.
    3. Islam, Towhidul & Meade, Nigel & Carson, Richard T. & Louviere, Jordan J. & Wang, Juan, 2022. "The usefulness of socio-demographic variables in predicting purchase decisions: Evidence from machine learning procedures," Journal of Business Research, Elsevier, vol. 151(C), pages 324-338.
    4. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    5. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    6. Ay, Jean-Sauveur & Le Gallo, Julie, 2021. "The Signaling Values of Nested Wine Names," Working Papers 321851, American Association of Wine Economists.
    7. Chen, Ruoyu & Jiang, Hanchen & Quintero, Luis E., 2023. "Measuring the value of rent stabilization and understanding its implications for racial inequality: Evidence from New York City," Regional Science and Urban Economics, Elsevier, vol. 103(C).
    8. Dangxing Chen & Luyao Zhang, 2023. "Monotonicity for AI ethics and society: An empirical study of the monotonic neural additive model in criminology, education, health care, and finance," Papers 2301.07060, arXiv.org.
    9. Ballestar, María Teresa & Mir, Miguel Cuerdo & Pedrera, Luis Miguel Doncel & Sainz, Jorge, 2024. "Effectiveness of tutoring at school: A machine learning evaluation," Technological Forecasting and Social Change, Elsevier, vol. 199(C).
    10. Daniel Levy & Tamir Mayer & Alon Raviv, 2020. "Academic Scholarship in Light of the 2008 Financial Crisis: Textual Analysis of NBER Working Papers," Working Papers hal-02488796, HAL.
    11. Combes, Pierre-Philippe & Gobillon, Laurent & Zylberberg, Yanos, 2022. "Urban economics in a historical perspective: Recovering data with machine learning," Regional Science and Urban Economics, Elsevier, vol. 94(C).
    12. Barzin, Samira & Avner, Paolo & Maruyama Rentschler, Jun Erik & O’Clery, Neave, 2022. "Where Are All the Jobs? A Machine Learning Approach for High Resolution Urban Employment Prediction in Developing Countries," Policy Research Working Paper Series 9979, The World Bank.
    13. Arenas, Andreu & Calsamiglia, Caterina, 2022. "Gender Differences in High-Stakes Performance and College Admission Policies," IZA Discussion Papers 15550, Institute of Labor Economics (IZA).
    14. Tsang, Andrew, 2021. "Uncovering Heterogeneous Regional Impacts of Chinese Monetary Policy," MPRA Paper 110703, University Library of Munich, Germany.
    15. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    16. Daniel Goller, 2023. "Analysing a built-in advantage in asymmetric darts contests using causal machine learning," Annals of Operations Research, Springer, vol. 325(1), pages 649-679, June.
    17. Doumpos, Michalis & Zopounidis, Constantin & Gounopoulos, Dimitrios & Platanakis, Emmanouil & Zhang, Wenke, 2023. "Operational research and artificial intelligence methods in banking," European Journal of Operational Research, Elsevier, vol. 306(1), pages 1-16.
    18. Hannes Wallimann & Silvio Sticher, 2023. "On suspicious tracks: machine-learning based approaches to detect cartels in railway-infrastructure procurement," Papers 2304.11888, arXiv.org.
    19. Rodríguez-Vargas, Adolfo, 2020. "Forecasting Costa Rican inflation with machine learning methods," Latin American Journal of Central Banking (previously Monetaria), Elsevier, vol. 1(1).
    20. Jesus Fernandez-Villaverde, 2020. "Simple Rules for a Complex World with Artificial Intelligence," PIER Working Paper Archive 20-010, Penn Institute for Economic Research, Department of Economics, University of Pennsylvania.

    More about this item

    Keywords

    Machine learning; Artificial intelligence; Explainable machine learning; Shapley values; Model interpretation;

    JEL classification:

    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General

