
Clustering method for censored and collinear survival data

Author

Listed:
  • Silvia Liverani

    (Queen Mary University of London
    The British Library)

  • Lucy Leigh

    (Hunter Medical Research Institute
    University of Newcastle)

  • Irene L. Hudson

    (Royal Melbourne Institute of Technology (RMIT))

  • Julie E. Byles

    (University of Newcastle)

Abstract

In this paper we propose a Dirichlet process mixture model for censored survival data with covariates. The model is suitable in two scenarios. First, it can be used to identify clusters determined jointly by the censored survival data and the predictors. Second, it is suitable for highly correlated predictors, where the usual survival models cannot be used because multicollinearity makes them unstable. The Dirichlet process mixture model links a response vector to covariate data through cluster membership; in this paper the model is extended to mixtures of Weibull distributions, which can model survival times and allow for censoring. We propose two variants of this model: one in which the Weibull shape parameter is common to all clusters (referred to as a global parameter) and one with a cluster-specific shape parameter. The first satisfies the proportional hazards assumption, while the second is more flexible, as it allows estimation of the survival curve whether or not the proportional hazards assumption is satisfied. We present a simulation study and, to demonstrate the applicability of the method in practice, an application to sleep surveys in older women from the Australian Longitudinal Study on Women’s Health. The method developed in the paper is available in the R package PReMiuM.
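
The difference between the two variants can be illustrated with a minimal sketch. It is not the PReMiuM implementation: the parametrization (hazard h(t) = shape * rate * t^(shape - 1), survival S(t) = exp(-rate * t^shape)) and all function names are assumptions chosen for illustration. The sketch shows how right-censored observations enter a Weibull log-likelihood, and why a shape parameter shared by two clusters gives a hazard ratio that does not depend on time.

```python
import math


def weibull_hazard(t, shape, rate):
    """Hazard under the assumed parametrization: shape * rate * t**(shape - 1)."""
    return shape * rate * t ** (shape - 1)


def weibull_log_survival(t, shape, rate):
    """Log survival function: log S(t) = -rate * t**shape."""
    return -rate * t ** shape


def censored_loglik(times, events, shape, rate):
    """Log-likelihood for right-censored data within one cluster.

    events[i] is 1 if the event was observed at times[i], and 0 if the
    observation was censored at times[i]. An observed event contributes
    log h(t) + log S(t); a censored one contributes only log S(t).
    """
    total = 0.0
    for t, observed in zip(times, events):
        if observed:
            total += math.log(weibull_hazard(t, shape, rate))
        total += weibull_log_survival(t, shape, rate)
    return total


def hazard_ratio(t, shape_a, rate_a, shape_b, rate_b):
    """Ratio of the hazards of two clusters (a over b) at time t."""
    return weibull_hazard(t, shape_a, rate_a) / weibull_hazard(t, shape_b, rate_b)


if __name__ == "__main__":
    # Common shape: the ratio equals rate_a / rate_b at every time point.
    for t in (0.5, 1.0, 5.0):
        print("common shape  ", t, hazard_ratio(t, 1.5, 2.0, 1.5, 1.0))
    # Cluster-specific shapes: the ratio changes with t.
    for t in (0.5, 1.0, 5.0):
        print("specific shape", t, hazard_ratio(t, 0.8, 2.0, 1.5, 1.0))
```

With a common shape the ratio reduces to rate_a / rate_b, which is the proportional hazards property. With cluster-specific shapes the ratio is proportional to t^(shape_a - shape_b), so it changes over time.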

Suggested Citation

  • Silvia Liverani & Lucy Leigh & Irene L. Hudson & Julie E. Byles, 2021. "Clustering method for censored and collinear survival data," Computational Statistics, Springer, vol. 36(1), pages 35-60, March.
  • Handle: RePEc:spr:compst:v:36:y:2021:i:1:d:10.1007_s00180-020-01000-3
    DOI: 10.1007/s00180-020-01000-3

    References listed on IDEAS

    1. Chung, Yeonseung & Dunson, David B., 2009. "Nonparametric Bayes Conditional Distribution Modeling With Variable Selection," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1646-1660.
    2. Bigelow, Jamie L. & Dunson, David B., 2009. "Bayesian Semiparametric Joint Models for Functional Predictors," Journal of the American Statistical Association, American Statistical Association, vol. 104(485), pages 26-36.
    3. Dunson, David B. & Herring, Amy H. & Siega-Riz, Anna Maria, 2008. "Bayesian Inference on Changes in Response Densities Over Predictor Clusters," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1508-1517.
    4. Liverani, Silvia & Hastie, David I. & Azizi, Lamiae & Papathomas, Michail & Richardson, Sylvia, 2015. "PReMiuM: An R Package for Profile Regression Mixture Models Using Dirichlet Processes," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 64(i07).
    5. W. R. Gilks & P. Wild, 1992. "Adaptive Rejection Sampling for Gibbs Sampling," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 41(2), pages 337-348, June.

    Citations



    Cited by:

    1. Lavigne, Aurore & Liverani, Silvia, 2024. "Quantifying the uncertainty of partitions for infinite mixture models," Statistics & Probability Letters, Elsevier, vol. 204(C).
    2. Rifai Afin & Keresztély Tibor & Cserháti Ilona, 2025. "Firm performance and markets: survival analysis of medium and large manufacturing enterprises in Indonesia," Economia e Politica Industriale: Journal of Industrial and Business Economics, Springer;Associazione Amici di Economia e Politica Industriale, vol. 52(1), pages 107-151, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Eric Coker & Robert Gunier & Asa Bradman & Kim Harley & Katherine Kogut & John Molitor & Brenda Eskenazi, 2017. "Association between Pesticide Profiles Used on Agricultural Fields near Maternal Residences during Pregnancy and IQ at Age 7 Years," IJERPH, MDPI, vol. 14(5), pages 1-20, May.
    2. Liverani, Silvia & Hastie, David I. & Azizi, Lamiae & Papathomas, Michail & Richardson, Sylvia, 2015. "PReMiuM: An R Package for Profile Regression Mixture Models Using Dirichlet Processes," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 64(i07).
    3. Jaeeun Yu & Jinsu Park & Taeryon Choi & Masahiro Hashizume & Yoonhee Kim & Yasushi Honda & Yeonseung Chung, 2021. "Nonparametric Bayesian Functional Meta-Regression: Applications in Environmental Epidemiology," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 26(1), pages 45-70, March.
    4. Bruno Scarpa & David B. Dunson, 2014. "Enriched Stick-Breaking Processes for Functional Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 647-660, June.
    5. Lauren Hoskovec & Wande Benka-Coker & Rachel Severson & Sheryl Magzamen & Ander Wilson, 2021. "Model choice for estimating the association between exposure to chemical mixtures and health outcomes: A simulation study," PLOS ONE, Public Library of Science, vol. 16(3), pages 1-21, March.
    6. Daniele Durante & Sally Paganin & Bruno Scarpa & David B. Dunson, 2017. "Bayesian modelling of networks in complex business intelligence problems," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 66(3), pages 555-580, April.
    7. Tamara Broderick & Robert Gramacy, 2011. "Classification and Categorical Inputs with Treed Gaussian Process Models," Journal of Classification, Springer;The Classification Society, vol. 28(2), pages 244-270, July.
    8. Annalina Sarra & Lara Fontanella & Simone Zio, 2019. "Identifying Students at Risk of Academic Failure Within the Educational Data Mining Framework," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 146(1), pages 41-60, November.
    9. Pang, W. K. & Yang, Z. H. & Hou, S. H. & Leung, P. K., 2002. "Non-uniform random variate generation by the vertical strip method," European Journal of Operational Research, Elsevier, vol. 142(3), pages 595-609, November.
    10. Roy, Vivekananda, 2014. "Efficient estimation of the link function parameter in a robust Bayesian binary regression model," Computational Statistics & Data Analysis, Elsevier, vol. 73(C), pages 87-102.
    11. Qi Li & Juan Lin & Jeffrey S. Racine, 2013. "Optimal Bandwidth Selection for Nonparametric Conditional Distribution and Quantile Functions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 31(1), pages 57-65, January.
    12. Wu, Lang, 2007. "A computationally efficient method for nonlinear mixed-effects models with nonignorable missing data in time-varying covariates," Computational Statistics & Data Analysis, Elsevier, vol. 51(5), pages 2410-2419, February.
    13. Austin Menger & Md. Tuhin Sheikh & Ming-Hui Chen, 2024. "Bayesian Modeling of Survival Data in the Presence of Competing Risks with Cure Fractions and Masked Causes," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 86(1), pages 199-227, November.
    14. Renat Sergazinov & Andrew Leroux & Erjia Cui & Ciprian Crainiceanu & R. Nisha Aurora & Naresh M. Punjabi & Irina Gaynanova, 2023. "A case study of glucose levels during sleep using multilevel fast function on scalar regression inference," Biometrics, The International Biometric Society, vol. 79(4), pages 3873-3882, December.
    15. Igari, Ryosuke & Hoshino, Takahiro, 2018. "A Bayesian data combination approach for repeated durations under unobserved missing indicators: Application to interpurchase-timing in marketing," Computational Statistics & Data Analysis, Elsevier, vol. 126(C), pages 150-166.
    16. Zhong, Peng & Huser, Raphaël & Opitz, Thomas, 2024. "Exact Simulation of Max-Infinitely Divisible Processes," Econometrics and Statistics, Elsevier, vol. 30(C), pages 96-109.
    17. Cai, Bo & Lin, Xiaoyan & Wang, Lianming, 2011. "Bayesian proportional hazards model for current status data with monotone splines," Computational Statistics & Data Analysis, Elsevier, vol. 55(9), pages 2644-2651, September.
    18. Hao, Meiling & Lin, Yuanyuan & Shen, Guohao & Su, Wen, 2023. "Nonparametric inference on smoothed quantile regression process," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    19. Samantha Leorato & Maura Mezzetti, 2015. "Spatial Panel Data Model with error dependence: a Bayesian Separable Covariance Approach," CEIS Research Paper 338, Tor Vergata University, CEIS, revised 09 Apr 2015.
    20. Griffin, J. E. & Steel, M. F. J., 2004. "Semiparametric Bayesian inference for stochastic frontier models," Journal of Econometrics, Elsevier, vol. 123(1), pages 121-152, November.
