
Clustering method for censored and collinear survival data

Authors

  • Silvia Liverani (Queen Mary University of London; The British Library)
  • Lucy Leigh (Hunter Medical Research Institute; University of Newcastle)
  • Irene L. Hudson (Royal Melbourne Institute of Technology (RMIT))
  • Julie E. Byles (University of Newcastle)

Abstract

In this paper we propose a Dirichlet process mixture model for censored survival data with covariates. The model is suitable in two scenarios. First, it can be used to identify clusters determined by both the censored survival data and the predictors. Second, it is suitable for highly correlated predictors, in cases where the usual survival models cannot be implemented because they would be unstable due to multicollinearity. The Dirichlet process mixture model links a response vector to covariate data through cluster membership, and in this paper the model is extended to mixtures of Weibull distributions, which can be used to model survival times and also allow for censoring. We propose two variants of this model: one in which the Weibull shape parameter is common to all clusters (referred to as a global parameter) and one with a cluster-specific shape parameter. The first satisfies the proportional hazards assumption, while the second is more flexible, as it allows estimation of the survival curve whether or not the proportional hazards assumption is satisfied. We present a simulation study and, to demonstrate the applicability of the method in practice, a real application to sleep surveys in older women from The Australian Longitudinal Study on Women’s Health. The method developed in the paper is available in the R package PReMiuM.
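The distinction between a global and a cluster-specific shape parameter can be illustrated with a minimal Python sketch. It is not the PReMiuM implementation, and the parameterisation is an assumption made here for illustration: hazard h(t) = rate · shape · t^(shape − 1) and survival S(t) = exp(−rate · t^shape). The function names and the numeric values are hypothetical. An observed event contributes log h(t) + log S(t) to the log-likelihood, while a right-censored observation contributes only log S(t).

```python
import math

# Assumed parameterisation (illustrative only, not taken from the paper):
#   hazard   h(t) = rate * shape * t**(shape - 1)
#   survival S(t) = exp(-rate * t**shape)

def weibull_log_hazard(t, shape, rate):
    """Log of the Weibull hazard at time t."""
    return math.log(rate) + math.log(shape) + (shape - 1.0) * math.log(t)

def weibull_log_survival(t, shape, rate):
    """Log of the Weibull survival function at time t."""
    return -rate * t ** shape

def censored_log_likelihood(times, events, shape, rate):
    """Log-likelihood for right-censored data within one cluster.

    events[i] is 1 if the event was observed at times[i],
    and 0 if the observation was censored at times[i].
    """
    total = 0.0
    for t, observed in zip(times, events):
        total += weibull_log_survival(t, shape, rate)
        if observed:
            total += weibull_log_hazard(t, shape, rate)
    return total

def hazard_ratio(t, shape_a, rate_a, shape_b, rate_b):
    """Ratio of the hazards of two clusters at time t."""
    return math.exp(
        weibull_log_hazard(t, shape_a, rate_a)
        - weibull_log_hazard(t, shape_b, rate_b)
    )

time_points = (0.5, 1.0, 4.0)

# Global shape: both clusters share shape 1.5, so the ratio does not depend on t.
ratios_global = [hazard_ratio(t, 1.5, 0.2, 1.5, 0.1) for t in time_points]

# Cluster-specific shapes: the ratio changes with t, so hazards are not proportional.
ratios_specific = [hazard_ratio(t, 2.0, 0.2, 1.0, 0.1) for t in time_points]

print(ratios_global)
print(ratios_specific)
```

With a shared shape the time-dependent factor cancels and the hazard ratio reduces to the ratio of the two rates, which is the proportional hazards property. With different shapes the ratio varies with t, which is why the cluster-specific variant does not rely on that assumption.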

Suggested Citation

  • Silvia Liverani & Lucy Leigh & Irene L. Hudson & Julie E. Byles, 2021. "Clustering method for censored and collinear survival data," Computational Statistics, Springer, vol. 36(1), pages 35-60, March.
  • Handle: RePEc:spr:compst:v:36:y:2021:i:1:d:10.1007_s00180-020-01000-3
    DOI: 10.1007/s00180-020-01000-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-020-01000-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



