
Semiparametric analysis of clustered interval‐censored survival data using soft Bayesian additive regression trees (SBART)

Authors

Listed:
  • Piyali Basak
  • Antonio Linero
  • Debajyoti Sinha
  • Stuart Lipsitz

Abstract

Popular parametric and semiparametric hazards regression models for clustered survival data are inappropriate and inadequate when the unknown effects of different covariates and clustering are complex. This calls for a flexible modeling framework to yield efficient survival prediction. Moreover, for some survival studies involving time to occurrence of some asymptomatic events, survival times are typically interval censored between consecutive clinical inspections. In this article, we propose a robust semiparametric model for clustered interval‐censored survival data under a paradigm of Bayesian ensemble learning, called soft Bayesian additive regression trees or SBART (Linero and Yang, 2018), which combines multiple sparse (soft) decision trees to attain excellent predictive accuracy. We develop a novel semiparametric hazards regression model by modeling the hazard function as a product of a parametric baseline hazard function and a nonparametric component that uses SBART to incorporate clustering, unknown functional forms of the main effects, and interaction effects of various covariates. In addition to being applicable for left‐censored, right‐censored, and interval‐censored survival data, our methodology is implemented using a data augmentation scheme which allows for existing Bayesian backfitting algorithms to be used. We illustrate the practical implementation and advantages of our method via simulation studies and an analysis of a prostate cancer surgery study where dependence on the experience and skill level of the physicians leads to clustering of survival times. We conclude by discussing our method's applicability in studies involving high‐dimensional data with complex underlying associations.
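
The censoring types named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Weibull baseline hazard and replaces the SBART component with a single fixed positive number `multiplier`, treated as constant in time. The function names and parameters are hypothetical. Under those assumptions the survival function is S(t | x) = exp(-multiplier * H0(t)), and an event known only to lie in (left, right] contributes S(left) - S(right) to the likelihood. Left censoring is the special case left = 0, and right censoring is the special case right = infinity.

```python
import math


def weibull_cumulative_hazard(t, shape, scale):
    """Cumulative baseline hazard H0(t) = (t / scale) ** shape.

    The Weibull form is an assumption made for illustration only.
    """
    if t <= 0.0:
        return 0.0
    if math.isinf(t):
        return math.inf
    return (t / scale) ** shape


def survival(t, shape, scale, multiplier):
    """S(t | x) = exp(-multiplier * H0(t)).

    `multiplier` is a hypothetical stand-in for the nonparametric
    (SBART) component evaluated at one subject's covariates and cluster.
    """
    if multiplier <= 0.0:
        raise ValueError("multiplier must be positive")
    return math.exp(-multiplier * weibull_cumulative_hazard(t, shape, scale))


def interval_likelihood(left, right, shape, scale, multiplier):
    """Likelihood contribution of an event known to lie in (left, right].

    left = 0 gives a left-censored observation;
    right = math.inf gives a right-censored observation.
    """
    if not 0.0 <= left < right:
        raise ValueError("require 0 <= left < right")
    s_left = survival(left, shape, scale, multiplier)
    s_right = survival(right, shape, scale, multiplier)
    return s_left - s_right


if __name__ == "__main__":
    # Event detected between inspections at t = 1 and t = 2.
    print(interval_likelihood(1.0, 2.0, shape=1.5, scale=2.0, multiplier=0.7))
    # Event-free at the last inspection at t = 3 (right censored).
    print(interval_likelihood(3.0, math.inf, shape=1.5, scale=2.0, multiplier=0.7))
```

In a full model the multiplier would vary by subject and cluster and would be estimated rather than fixed. The sketch shows only how one likelihood expression covers all three censoring types.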

Suggested Citation

  • Piyali Basak & Antonio Linero & Debajyoti Sinha & Stuart Lipsitz, 2022. "Semiparametric analysis of clustered interval‐censored survival data using soft Bayesian additive regression trees (SBART)," Biometrics, The International Biometric Society, vol. 78(3), pages 880-893, September.
  • Handle: RePEc:bla:biomet:v:78:y:2022:i:3:p:880-893
    DOI: 10.1111/biom.13478

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13478
    Download Restriction: no

    References listed on IDEAS

    1. Bani K. Mallick & David G. T. Denison & Adrian F. M. Smith, 1999. "Bayesian Survival Analysis Using A Mars Model," Biometrics, The International Biometric Society, vol. 55(4), pages 1071-1077, December.
    2. Umlauf, Nikolaus & Adler, Daniel & Kneib, Thomas & Lang, Stefan & Zeileis, Achim, 2015. "Structured Additive Regression Models: An R Interface to BayesX," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 63(i21).
    3. Maria De Iorio & Wesley O. Johnson & Peter Müller & Gary L. Rosner, 2009. "Bayesian Nonparametric Nonproportional Hazards Survival Modeling," Biometrics, The International Biometric Society, vol. 65(3), pages 762-771, September.
    4. Haiming Zhou & Timothy Hanson & Jiajia Zhang, 2017. "Generalized accelerated failure time spatial frailty model for arbitrarily censored data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(3), pages 495-515, July.
    5. Timothy Hanson & Mingan Yang, 2007. "Bayesian Semiparametric Proportional Odds Models," Biometrics, The International Biometric Society, vol. 63(1), pages 88-95, March.
    6. Antonio R. Linero & Yun Yang, 2018. "Bayesian regression tree ensembles that adapt to smoothness and sparsity," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(5), pages 1087-1110, November.
    7. Su Xiaogang & Zhou Tianni & Yan Xin & Fan Juanjuan & Yang Song, 2008. "Interaction Trees with Censored Survival Data," The International Journal of Biostatistics, De Gruyter, vol. 4(1), pages 1-26, January.
    8. Debajyoti Sinha & Ming-Hui Chen & Sujit K. Ghosh, 1999. "Bayesian Analysis and Model Selection for Interval-Censored Survival Data," Biometrics, The International Biometric Society, vol. 55(2), pages 585-590, June.
    9. Helene Roth & Stefan Lang & Helga Wagner, 2015. "Random intercept selection in structured additive regression models," Working Papers 2015-02, Faculty of Economics and Statistics, Universität Innsbruck.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jiajia Zhang & Timothy Hanson & Haiming Zhou, 2019. "Bayes factors for choosing among six common survival models," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(2), pages 361-379, April.
    2. Brown, Paul T. & Joshi, Chaitanya & Joe, Stephen & Rue, Håvard, 2021. "A novel method of marginalisation using low discrepancy sequences for integrated nested Laplace approximations," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    3. Jamie Roberman & Theophilus I. Emeto & Oyelola A. Adegboye, 2021. "Adverse Birth Outcomes Due to Exposure to Household Air Pollution from Unclean Cooking Fuel among Women of Reproductive Age in Nigeria," IJERPH, MDPI, vol. 18(2), pages 1-15, January.
    4. Gressani, Oswaldo & Lambert, Philippe, 2021. "Laplace approximations for fast Bayesian inference in generalized additive models based on P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    5. Lianming Wang & David B. Dunson, 2011. "Semiparametric Bayes' Proportional Odds Models for Current Status Data with Underreporting," Biometrics, The International Biometric Society, vol. 67(3), pages 1111-1118, September.
    6. Roger Bivand & Giovanni Millo & Gianfranco Piras, 2021. "A Review of Software for Spatial Econometrics in R," Mathematics, MDPI, vol. 9(11), pages 1-40, June.
    7. David B. Dunson & Patricia Chulada & Samuel J. Arbes Jr, 2003. "Bayesian Modeling of Time-Varying and Waning Exposure Effects," Biometrics, The International Biometric Society, vol. 59(1), pages 83-91, March.
    8. Sibhatu, Kibrom T. & Steinhübel, Linda & Siregar, Hermanto & Qaim, Matin & Wollni, Meike, 2021. "Spatial Heterogeneity of Oil Palm Production in Indonesia: Implications for Intervention Strategies," 2021 Conference, August 17-31, 2021, Virtual 315222, International Association of Agricultural Economists.
    9. Sibhatu, Kibrom T. & Steinhübel, Linda & Siregar, Hermanto & Qaim, Matin & Wollni, Meike, 2022. "Spatial heterogeneity in smallholder oil palm production," Forest Policy and Economics, Elsevier, vol. 139(C).
    10. Angel G. Ortiz & Daniel Wiese & Kristen A. Sorice & Minhhuyen Nguyen & Evelyn T. González & Kevin A. Henry & Shannon M. Lynch, 2020. "Liver Cancer Incidence and Area-Level Geographic Disparities in Pennsylvania—A Geo-Additive Approach," IJERPH, MDPI, vol. 17(20), pages 1-20, October.
    11. Schmidt, Paul & Mühlau, Mark & Schmid, Volker, 2017. "Fitting large-scale structured additive regression models using Krylov subspace methods," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 59-75.
    12. Seiler, Johannes & Harttgen, Kenneth & Kneib, Thomas & Lang, Stefan, 2021. "Modelling children's anthropometric status using Bayesian distributional regression merging socio-economic and remote sensed data from South Asia and sub-Saharan Africa," Economics & Human Biology, Elsevier, vol. 40(C).
    13. Kenneth Harttgen & Stefan Lang & Judith Santer & Johannes Seiler, 2017. "Modeling under-5 mortality through multilevel structured additive regression with varying coefficients for Asia and Sub-Saharan Africa," Working Papers 2017-15, Faculty of Economics and Statistics, Universität Innsbruck.
    14. Bernhard Baumgartner & Daniel Guhl & Thomas Kneib & Winfried J. Steiner, 2018. "Flexible estimation of time-varying effects for frequently purchased retail goods: a modeling approach based on household panel data," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 40(4), pages 837-873, October.
    15. Simon N. Wood & Natalya Pya & Benjamin Säfken, 2016. "Smoothing Parameter and Model Selection for General Smooth Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1548-1563, October.
    16. Ruoqing Zhu & Ying-Qi Zhao & Guanhua Chen & Shuangge Ma & Hongyu Zhao, 2017. "Greedy outcome weighted tree learning of optimal personalized treatment rules," Biometrics, The International Biometric Society, vol. 73(2), pages 391-400, June.
    17. Linda Steinhübel & Johannes Wegmann & Oliver Mußhoff, 2020. "Digging deep and running dry—the adoption of borewell technology in the face of climate change and urbanization," Agricultural Economics, International Association of Agricultural Economists, vol. 51(5), pages 685-706, September.
    18. Im, Yunju & Tan, Aixin, 2021. "Bayesian subgroup analysis in regression using mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 162(C).
    19. Philippe Goulet Coulombe & Mikael Frenette & Karin Klieber, 2023. "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks," Working Papers 23-04, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management, revised Nov 2023.
    20. Youngjoo Cho & Debashis Ghosh, 2021. "Quantile-Based Subgroup Identification for Randomized Clinical Trials," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 13(1), pages 90-128, April.
