Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i19p4150-d1252508.html

On the Reliability of Machine Learning Models for Survival Analysis When Cure Is a Possibility

Authors
  • Ana Ezquerro

    (Faculty of Informatics, University of A Coruña, 15071 A Coruña, Spain)

  • Brais Cancela

    (CITIC, LIDIA Group, Department of Computer Science, University of A Coruña, 15071 A Coruña, Spain)

  • Ana López-Cheda

    (CITIC, MODES Group, Department of Mathematics, University of A Coruña, 15071 A Coruña, Spain)

Abstract

In classical survival analysis, it is assumed that all individuals will experience the event of interest. However, if there is a proportion of subjects who will never experience the event, then a standard survival approach is not appropriate, and cure models should be considered instead. This paper addresses the problem of adapting a machine learning approach designed for classical survival analysis to a situation in which cure (i.e., not suffering the event) is a possibility. Specifically, a brief review of cure models and recent machine learning methodologies is presented, and an adaptation of machine learning approaches to account for cured individuals is introduced. To validate the proposed methods, we present an extensive simulation study in which we compare the performance of the adapted machine learning algorithms with that of existing cure models. The results show that either the semiparametric or the nonparametric approach performs well, depending on the simulated scenario. The practical utility of the methodology is showcased through two illustrations with real-world datasets. In the first one, the results show the gain from using the nonparametric mixture cure model approach. In the second example, the results show the poor performance of some machine learning methods for small sample sizes.
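
To make the notion of cure concrete, the following is a minimal sketch, not the authors' method, of data generated from a mixture cure model, in which the population survival function is S(t) = (1 - p) + p * S_u(t), with p the probability of being susceptible and S_u the survival function of the susceptible subjects. The exponential latency, the uniform censoring, the covariate-free constant cure probability, and the helper names `simulate_cure_data` and `kaplan_meier` are all assumptions made for illustration and do not come from the paper. With sufficiently long follow-up, the Kaplan-Meier curve levels off near the cure probability instead of dropping to zero, which is the behavior a standard survival model cannot represent.

```python
import numpy as np

def simulate_cure_data(n, cure_prob, rate, censor_max, rng):
    """Draw (observed_time, event_indicator, cured) from a mixture cure model.

    Assumptions (illustrative only): exponential event times for susceptible
    subjects, uniform censoring on [0, censor_max], no covariates.
    """
    cured = rng.random(n) < cure_prob            # latent cure status, unobserved in practice
    event_time = rng.exponential(1.0 / rate, n)  # latency for susceptible subjects
    event_time[cured] = np.inf                   # cured subjects never experience the event
    censor_time = rng.uniform(0.0, censor_max, n)
    observed = np.minimum(event_time, censor_time)
    event = (event_time <= censor_time).astype(int)
    return observed, event, cured

def kaplan_meier(time, event):
    """Return the distinct event times and the Kaplan-Meier estimate at each."""
    event_times = np.unique(time[event == 1])
    surv = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(time >= t)                  # subjects still under observation at t
        deaths = np.sum((time == t) & (event == 1))  # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv[i] = s
    return event_times, surv

rng = np.random.default_rng(0)
time, event, cured = simulate_cure_data(
    n=2000, cure_prob=0.3, rate=1.0, censor_max=10.0, rng=rng
)
_, surv = kaplan_meier(time, event)
plateau = surv[-1]  # height of the curve after the last observed event
print(f"Kaplan-Meier plateau: {plateau:.3f} (simulated cure probability: 0.3)")
```

The plateau is only a rough estimate of the cure probability, and it relies on follow-up being long enough that almost all susceptible subjects have had the event before the largest censoring times.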

Suggested Citation

  • Ana Ezquerro & Brais Cancela & Ana López-Cheda, 2023. "On the Reliability of Machine Learning Models for Survival Analysis When Cure Is a Possibility," Mathematics, MDPI, vol. 11(19), pages 1-21, October.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:19:p:4150-:d:1252508

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/19/4150/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/19/4150/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. López-Cheda, Ana & Cao, Ricardo & Jácome, M. Amalia & Van Keilegom, Ingrid, 2017. "Nonparametric incidence estimation and bootstrap bandwidth selection in mixture cure models," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 144-165.
    2. Lopez-Cheda, Ana & Cao, Ricardo & Jacome, Maria Amalia & Van Keilegom, Ingrid, 2015. "Nonparametric incidence and latency estimation in mixture cure models," LIDAM Discussion Papers ISBA 2015014, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    3. Bremhorst, Vincent & Lambert, Philippe, 2013. "Flexible estimation in cure survival models using Bayesian P-splines," LIDAM Discussion Papers ISBA 2013039, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    4. Bremhorst, Vincent & Lambert, Philippe, 2016. "Flexible estimation in cure survival models using Bayesian P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 270-284.
    5. Amico, Mailis & Van Keilegom, Ingrid, 2017. "Cure models in survival analysis," LIDAM Discussion Papers ISBA 2017007, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    6. Philippe Lambert & Vincent Bremhorst, 2020. "Inclusion of time‐varying covariates in cure survival models with an application in fertility studies," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(1), pages 333-354, January.
    7. Narisetty, Naveen & Koenker, Roger, 2022. "Censored quantile regression survival models with a cure proportion," Journal of Econometrics, Elsevier, vol. 226(1), pages 192-203.
    8. Ortega, Edwin M.M. & Cordeiro, Gauss M. & Lemonte, Artur J., 2012. "A log-linear regression model for the β-Birnbaum–Saunders distribution with censored data," Computational Statistics & Data Analysis, Elsevier, vol. 56(3), pages 698-718.
    9. Guoqing Diao & Ao Yuan, 2019. "A class of semiparametric cure models with current status data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(1), pages 26-51, January.
    10. Ana López-Cheda & Yingwei Peng & María Amalia Jácome, 2023. "Nonparametric estimation in mixture cure models with covariates," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(2), pages 467-495, June.
    11. Gressani, Oswaldo & Lambert, Philippe, 2016. "Fast Bayesian inference in semi-parametric P-spline cure survival models using Laplace approximations," LIDAM Discussion Papers ISBA 2016041, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    12. Hu, Tao & Xiang, Liming, 2016. "Partially linear transformation cure models for interval-censored data," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 257-269.
    13. Li, Shuwei & Hu, Tao & Zhao, Xingqiu & Sun, Jianguo, 2019. "A class of semiparametric transformation cure models for interval-censored failure time data," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 153-165.
    14. Gressani, Oswaldo & Lambert, Philippe, 2018. "Fast Bayesian inference using Laplace approximations in a flexible promotion time cure model based on P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 124(C), pages 151-167.
    15. Xiaoguang Wang & Ziwen Wang, 2021. "EM algorithm for the additive risk mixture cure model with interval-censored data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 27(1), pages 91-130, January.
    16. Yilong Zhang & Xiaoxia Han & Yongzhao Shao, 2021. "The ROC of Cox proportional hazards cure models with application in cancer studies," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 27(2), pages 195-215, April.
    17. Lu Wang & Pang Du & Hua Liang, 2012. "Two-Component Mixture Cure Rate Model with Spline Estimated Nonparametric Components," Biometrics, The International Biometric Society, vol. 68(3), pages 726-735, September.
    18. Man-Hua Chen & Xingwei Tong, 2020. "Varying coefficient transformation cure models for failure time data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 26(3), pages 518-544, July.
    19. Naveen Narisetty & Roger Koenker, 2019. "Censored quantile regression survival models with a cure proportion," CeMMAP working papers CWP56/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    20. Yuan Wu & Christina D. Chambers & Ronghui Xu, 2019. "Semiparametric sieve maximum likelihood estimation under cure model with partly interval censored and left truncated data for application to spontaneous abortion," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 25(3), pages 507-528, July.
