
Bayesian inference and cure rate modeling for event history data

Authors

  • Panagiotis Papastamoulis (Athens University of Economics and Business)
  • Fotios S. Milienos (Panteion University of Social and Political Sciences)

Abstract

Estimating the parameters of a general family of cure models is a challenging task, mainly because the likelihood function is flat and multimodal. In this work, we propose a fully Bayesian approach to overcome these issues. Posterior inference is carried out with a Metropolis-coupled Markov chain Monte Carlo (MCMC) sampler, which combines Gibbs sampling for the latent cure indicators with Metropolis–Hastings steps based on Langevin diffusion dynamics for the parameter updates. The main MCMC algorithm is embedded within a parallel tempering scheme that considers heated versions of the target posterior distribution. The simulation study demonstrates that the proposed algorithm freely explores the multimodal posterior distribution and produces robust point estimates, and that it outperforms maximum likelihood estimation via the Expectation–Maximization algorithm. As a by-product, the Bayesian implementation makes it possible to control the False Discovery Rate when classifying items as cured or not. Finally, the proposed method is illustrated on a real dataset on recidivism among offenders released from prison; the event of interest is whether the offender was re-incarcerated after probation.
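
The sampling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the cure model posterior for a hypothetical one-dimensional bimodal target (an equal mixture of two unit-variance Gaussians at ±MU), omits the Gibbs step for the latent cure indicators, and uses an assumed temperature ladder `betas` and step size `eps`. It only shows how Metropolis-adjusted Langevin updates on heated targets p(x)^beta can be combined with swap moves between adjacent chains so that the cold chain (beta = 1) visits both modes.

```python
import math
import random

MU = 3.0  # hypothetical mode location of the toy bimodal target (assumption)

def log_p(x):
    """Unnormalised log density of 0.5*N(-MU, 1) + 0.5*N(MU, 1)."""
    a, b = -0.5 * (x + MU) ** 2, -0.5 * (x - MU) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def grad_log_p(x):
    """Gradient of log_p; the mixture weights reduce to a tanh term."""
    return -x + MU * math.tanh(MU * x)

def mala_step(x, beta, eps, rng):
    """One Metropolis-adjusted Langevin step targeting p(x)**beta."""
    def mean(z):
        # Langevin proposal mean: half a squared step along the heated gradient
        return z + 0.5 * eps ** 2 * beta * grad_log_p(z)
    prop = rng.gauss(mean(x), eps)
    log_q_fwd = -((prop - mean(x)) ** 2) / (2 * eps ** 2)
    log_q_rev = -((x - mean(prop)) ** 2) / (2 * eps ** 2)
    log_alpha = beta * (log_p(prop) - log_p(x)) + log_q_rev - log_q_fwd
    return prop if math.log(1.0 - rng.random()) < log_alpha else x

def parallel_tempering(n_iter=20000, betas=(1.0, 0.4, 0.15, 0.05),
                       eps=1.0, seed=1):
    """Run one chain per inverse temperature; return the cold-chain draws."""
    rng = random.Random(seed)
    x = [MU] * len(betas)  # start every chain in the right-hand mode
    cold = []
    for _ in range(n_iter):
        # Within-chain update of each heated target
        x = [mala_step(xk, b, eps, rng) for xk, b in zip(x, betas)]
        # Propose swapping the states of a random pair of adjacent chains
        k = rng.randrange(len(betas) - 1)
        log_r = (betas[k] - betas[k + 1]) * (log_p(x[k + 1]) - log_p(x[k]))
        if math.log(1.0 - rng.random()) < log_r:
            x[k], x[k + 1] = x[k + 1], x[k]
        cold.append(x[0])  # only the beta = 1 chain targets p itself
    return cold

cold = parallel_tempering()
frac_right = sum(c > 0 for c in cold) / len(cold)
print(f"share of cold-chain draws in the right-hand mode: {frac_right:.2f}")
```

Because the two modes carry equal mass in this toy target, a well-mixing sampler should place roughly half of the cold-chain draws in each mode even though every chain starts in the right-hand one. In the setting of the abstract, the heated chains would play the same role for the flat, multimodal cure model posterior.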

Suggested Citation

  • Panagiotis Papastamoulis & Fotios S. Milienos, 2025. "Bayesian inference and cure rate modeling for event history data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 34(1), pages 1-27, March.
  • Handle: RePEc:spr:testjl:v:34:y:2025:i:1:d:10.1007_s11749-024-00942-w
    DOI: 10.1007/s11749-024-00942-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11749-024-00942-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Arnak S. Dalalyan, 2017. "Theoretical guarantees for approximate sampling from smooth and log-concave densities," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(3), pages 651-676, June.
    2. Gressani, Oswaldo & Lambert, Philippe, 2016. "Fast Bayesian inference in semi-parametric P-spline cure survival models using Laplace approximations," LIDAM Discussion Papers ISBA 2016041, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    3. Bremhorst, Vincent & Lambert, Philippe, 2016. "Flexible estimation in cure survival models using Bayesian P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 270-284.
    4. López-Cheda, Ana & Cao, Ricardo & Jácome, M. Amalia & Van Keilegom, Ingrid, 2017. "Nonparametric incidence estimation and bootstrap bandwidth selection in mixture cure models," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 144-165.
    5. Burda, Martin & Maheu, John M., 2013. "Bayesian adaptively updated Hamiltonian Monte Carlo with an application to high-dimensional BEKK GARCH models," Studies in Nonlinear Dynamics & Econometrics, De Gruyter, vol. 17(4), pages 345-372, September.
    6. Dang, Khue-Dung & Quiroz, Matias & Kohn, Robert & Tran, Minh-Ngoc & Villani, Mattias, 2019. "Hamiltonian Monte Carlo with Energy Conserving Subsampling," Working Paper Series 372, Sveriges Riksbank (Central Bank of Sweden).
    7. Lopez-Cheda, Ana & Cao, Ricardo & Jacome, Maria Amalia & Van Keilegom, Ingrid, 2015. "Nonparametric incidence and latency estimation in mixture cure models," LIDAM Discussion Papers ISBA 2015014, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    8. Beskos, A. & Pinski, F.J. & Sanz-Serna, J.M. & Stuart, A.M., 2011. "Hybrid Monte Carlo on Hilbert spaces," Stochastic Processes and their Applications, Elsevier, vol. 121(10), pages 2201-2230, October.
    9. Lu Wang & Pang Du & Hua Liang, 2012. "Two-Component Mixture Cure Rate Model with Spline Estimated Nonparametric Components," Biometrics, The International Biometric Society, vol. 68(3), pages 726-735, September.
    10. Olayidé Boussari & Laurent Bordes & Gaëlle Romain & Marc Colonna & Nadine Bossard & Laurent Remontet & Valérie Jooste, 2021. "Modeling excess hazard with time‐to‐cure as a parameter," Biometrics, The International Biometric Society, vol. 77(4), pages 1289-1302, December.
    11. Ana Ezquerro & Brais Cancela & Ana López-Cheda, 2023. "On the Reliability of Machine Learning Models for Survival Analysis When Cure Is a Possibility," Mathematics, MDPI, vol. 11(19), pages 1-21, October.
    12. Mike Tsionas & Marwan Izzeldin & Lorenzo Trapani, 2019. "Bayesian estimation of large dimensional time varying VARs using copulas," Papers 1912.12527, arXiv.org.
    13. Bremhorst, Vincent & Lambert, Philippe, 2013. "Flexible estimation in cure survival models using Bayesian P-splines," LIDAM Discussion Papers ISBA 2013039, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    14. M Ludkin & C Sherlock, 2023. "Hug and hop: a discrete-time, nonreversible Markov chain Monte Carlo algorithm," Biometrika, Biometrika Trust, vol. 110(2), pages 301-318.
    15. Yuan, Mengdie & Diao, Guoqing, 2014. "Semiparametric Odds Rate Model for Modeling Short-Term and Long-Term Effects with Application to a Breast Cancer Genetic Study," The International Journal of Biostatistics, De Gruyter, vol. 10(2), pages 231-249, November.
    16. Gressani, Oswaldo & Lambert, Philippe, 2018. "Fast Bayesian inference using Laplace approximations in a flexible promotion time cure model based on P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 124(C), pages 151-167.
    17. Barreto-Souza, Wagner, 2015. "Long-term survival models with overdispersed number of competing causes," Computational Statistics & Data Analysis, Elsevier, vol. 91(C), pages 51-63.
    18. Xifara, T. & Sherlock, C. & Livingstone, S. & Byrne, S. & Girolami, M., 2014. "Langevin diffusions and the Metropolis-adjusted Langevin algorithm," Statistics & Probability Letters, Elsevier, vol. 91(C), pages 14-19.
    19. Martin Burda & John Maheu, 2011. "Bayesian Adaptive Hamiltonian Monte Carlo with an Application to High-Dimensional BEKK GARCH Models," Working Papers tecipa-438, University of Toronto, Department of Economics.
    20. Hanin, Leonid & Huang, Li-Shan, 2014. "Identifiability of cure models revisited," Journal of Multivariate Analysis, Elsevier, vol. 130(C), pages 261-274.
