IDEAS home Printed from https://ideas.repec.org/a/eee/reensy/v207y2021ics0951832020308784.html

Assessing variable activity for Bayesian regression trees

Author

Listed:
  • Horiguchi, Akira
  • Pratola, Matthew T.
  • Santner, Thomas J.

Abstract

Bayesian Additive Regression Trees (BART) are non-parametric models that can capture complex exogenous variable effects. In any regression problem, it is often of interest to learn which variables are most active. Variable activity in BART is usually measured by counting the number of times a tree splits for each variable. Such one-way counts have the advantage of fast computations. Despite their convenience, one-way counts have several issues. They are statistically unjustified, cannot distinguish between main effects and interaction effects, and become inflated when measuring interaction effects. An alternative method well-established in the literature is Sobol' indices, a variance-based global sensitivity analysis technique. However, these indices often require Monte Carlo integration, which can be computationally expensive. This paper provides analytic expressions for Sobol' indices for BART posterior samples. These expressions are easy to interpret and are computationally feasible. Furthermore, we will show a fascinating connection between first-order (main-effects) Sobol' indices and one-way counts. We also introduce a novel ranking method, and use this to demonstrate that the proposed indices preserve the Sobol'-based rank order of variable importance. Finally, we compare these methods using analytic test functions and the En-ROADS climate impacts simulator.
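
To make the Monte Carlo cost mentioned in the abstract concrete, the following is a minimal sketch of a generic pick-freeze estimator for first-order Sobol' indices on independent uniform inputs. It is not the paper's analytic method, and it does not involve BART. The names `first_order_sobol` and `toy` are hypothetical and chosen only for illustration. The toy function is Y = X1 + 2·X2·X3 on the unit cube, with a fourth inert input.

```python
import numpy as np

def first_order_sobol(f, d, n=100_000, seed=0):
    """Pick-freeze Monte Carlo estimate of first-order Sobol' indices.

    Assumes the d inputs are independent and uniform on [0, 1].
    Costs (d + 2) * n evaluations of f, which is what makes the
    approach expensive for slow simulators.
    """
    rng = np.random.default_rng(seed)
    A = rng.random((n, d))          # first independent sample
    B = rng.random((n, d))          # second independent sample
    fA, fB = f(A), f(B)
    var_y = np.var(np.concatenate([fA, fB]))
    indices = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]          # A with column i taken from B
        # Estimates Var(E[Y | X_i]) = E[f(B) f(AB)] - E[Y]^2
        indices[i] = np.mean(fB * (f(AB) - fA)) / var_y
    return indices

def toy(X):
    # Hypothetical test function: one main effect plus one interaction.
    return X[:, 0] + 2.0 * X[:, 1] * X[:, 2]

if __name__ == "__main__":
    S = first_order_sobol(toy, d=4, n=200_000, seed=1)
    # Analytic first-order indices for this toy are 0.3, 0.3, 0.3, 0.
    print(np.round(S, 2))
```

For this toy function the first-order indices sum to 0.9; the remaining 0.1 of the variance belongs to the X2-X3 interaction, which is the kind of main-effect versus interaction distinction the abstract says one-way split counts cannot make.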

Suggested Citation

  • Horiguchi, Akira & Pratola, Matthew T. & Santner, Thomas J., 2021. "Assessing variable activity for Bayesian regression trees," Reliability Engineering and System Safety, Elsevier, vol. 207(C).
  • Handle: RePEc:eee:reensy:v:207:y:2021:i:c:s0951832020308784
    DOI: 10.1016/j.ress.2020.107391

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0951832020308784
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.ress.2020.107391?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Gramacy, Robert B. & Taddy, Matthew Alan, 2010. "Categorical Inputs, Sensitivity Analysis, Optimization and Importance Tempering with tgp Version 2, an R Package for Treed Gaussian Process Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i06).
    2. Sudret, Bruno, 2008. "Global sensitivity analysis using polynomial chaos expansions," Reliability Engineering and System Safety, Elsevier, vol. 93(7), pages 964-979.
    3. Mara, Thierry A. & Tarantola, Stefano, 2012. "Variance-based sensitivity indices for models with dependent inputs," Reliability Engineering and System Safety, Elsevier, vol. 107(C), pages 115-121.
    4. Crestaux, Thierry & Le Maître, Olivier & Martinez, Jean-Marc, 2009. "Polynomial chaos expansion for sensitivity analysis," Reliability Engineering and System Safety, Elsevier, vol. 94(7), pages 1161-1172.
    5. Jeremy E. Oakley & Anthony O'Hagan, 2004. "Probabilistic sensitivity analysis of complex models: a Bayesian approach," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 66(3), pages 751-769, August.
    6. Antonio R. Linero, 2018. "Bayesian Regression Trees for High-Dimensional Prediction and Variable Selection," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(522), pages 626-636, April.
    7. Gramacy, Robert B & Lee, Herbert K. H, 2008. "Bayesian Treed Gaussian Process Models With an Application to Computer Modeling," Journal of the American Statistical Association, American Statistical Association, vol. 103(483), pages 1119-1130.
    8. Kucherenko, Sergei & Feil, Balazs & Shah, Nilay & Mauntz, Wolfgang, 2011. "The identification of model effective dimensions using global sensitivity analysis," Reliability Engineering and System Safety, Elsevier, vol. 96(4), pages 440-449.
    9. Gramacy, Robert B., 2007. "tgp: An R Package for Bayesian Nonstationary, Semiparametric Nonlinear Regression and Design by Treed Gaussian Process Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 19(i09).

    Citations


    Cited by:

    1. Shi, Wen & Zhou, Qing & Zhou, Yanju, 2023. "An efficient elementary effect-based method for sensitivity analysis in identifying main and two-factor interaction effects," Reliability Engineering and System Safety, Elsevier, vol. 237(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matieyendou Lamboni, 2020. "Uncertainty quantification: a minimum variance unbiased (joint) estimator of the non-normalized Sobol’ indices," Statistical Papers, Springer, vol. 61(5), pages 1939-1970, October.
    2. Chen, Xin & Molina-Cristóbal, Arturo & Guenov, Marin D. & Riaz, Atif, 2019. "Efficient method for variance-based sensitivity analysis," Reliability Engineering and System Safety, Elsevier, vol. 181(C), pages 97-115.
    3. Wu, Zeping & Wang, Donghui & Okolo N, Patrick & Hu, Fan & Zhang, Weihua, 2016. "Global sensitivity analysis using a Gaussian Radial Basis Function metamodel," Reliability Engineering and System Safety, Elsevier, vol. 154(C), pages 171-179.
    4. Wei, Pengfei & Lu, Zhenzhou & Song, Jingwen, 2015. "Variable importance analysis: A comprehensive review," Reliability Engineering and System Safety, Elsevier, vol. 142(C), pages 399-432.
    5. Daniel W. Gladish & Daniel E. Pagendam & Luk J. M. Peeters & Petra M. Kuhnert & Jai Vaze, 2018. "Emulation Engines: Choice and Quantification of Uncertainty for Complex Hydrological Models," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(1), pages 39-62, March.
    6. Wu, Zeping & Wang, Wenjie & Wang, Donghui & Zhao, Kun & Zhang, Weihua, 2019. "Global sensitivity analysis using orthogonal augmented radial basis function," Reliability Engineering and System Safety, Elsevier, vol. 185(C), pages 291-302.
    7. Al Ali, Hannah & Daneshkhah, Alireza & Boutayeb, Abdesslam & Malunguza, Noble Jahalamajaha & Mukandavire, Zindoga, 2022. "Exploring dynamical properties of a Type 1 diabetes model using sensitivity approaches," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 201(C), pages 324-342.
    8. Lambert, Romain S.C. & Lemke, Frank & Kucherenko, Sergei S. & Song, Shufang & Shah, Nilay, 2016. "Global sensitivity analysis using sparse high dimensional model representations generated by the group method of data handling," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 128(C), pages 42-54.
    9. Deman, G. & Konakli, K. & Sudret, B. & Kerrou, J. & Perrochet, P. & Benabderrahmane, H., 2016. "Using sparse polynomial chaos expansions for the global sensitivity analysis of groundwater lifetime expectancy in a multi-layered hydrogeological model," Reliability Engineering and System Safety, Elsevier, vol. 147(C), pages 156-169.
    10. Touzani, Samir & Busby, Daniel, 2013. "Smoothing spline analysis of variance approach for global sensitivity analysis of computer codes," Reliability Engineering and System Safety, Elsevier, vol. 112(C), pages 67-81.
    11. Melito, Gian Marco & Müller, Thomas Stephan & Badeli, Vahid & Ellermann, Katrin & Brenn, Günter & Reinbacher-Köstinger, Alice, 2021. "Sensitivity analysis study on the effect of the fluid mechanics assumptions for the computation of electrical conductivity of flowing human blood," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    12. Matieyendou Lamboni, 2018. "Global sensitivity analysis: a generalized, unbiased and optimal estimator of total-effect variance," Statistical Papers, Springer, vol. 59(1), pages 361-386, March.
    13. Borgonovo, Emanuele & Plischke, Elmar, 2016. "Sensitivity analysis: A review of recent advances," European Journal of Operational Research, Elsevier, vol. 248(3), pages 869-887.
    14. Anstett-Collin, F. & Goffart, J. & Mara, T. & Denis-Vidal, L., 2015. "Sensitivity analysis of complex models: Coping with dynamic and static inputs," Reliability Engineering and System Safety, Elsevier, vol. 134(C), pages 268-275.
    15. Becker, William, 2020. "Metafunctions for benchmarking in sensitivity analysis," Reliability Engineering and System Safety, Elsevier, vol. 204(C).
    16. Kapusuzoglu, Berkcan & Mahadevan, Sankaran, 2021. "Information fusion and machine learning for sensitivity analysis using physics knowledge and experimental data," Reliability Engineering and System Safety, Elsevier, vol. 214(C).
    17. Davis, Casey B. & Hans, Christopher M. & Santner, Thomas J., 2021. "Prediction of non-stationary response functions using a Bayesian composite Gaussian process," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    18. Mara, Thierry A. & Becker, William E., 2021. "Polynomial chaos expansion for sensitivity analysis of model output with dependent inputs," Reliability Engineering and System Safety, Elsevier, vol. 214(C).
    19. Konakli, Katerina & Sudret, Bruno, 2016. "Global sensitivity analysis using low-rank tensor approximations," Reliability Engineering and System Safety, Elsevier, vol. 156(C), pages 64-83.
    20. Arnst, M. & Goyal, K., 2017. "Sensitivity analysis of parametric uncertainties and modeling errors in computational-mechanics models by using a generalized probabilistic modeling approach," Reliability Engineering and System Safety, Elsevier, vol. 167(C), pages 394-405.
