IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0279435.html

Statistical approaches to identifying significant differences in predictive performance between machine learning and classical statistical models for survival data

Authors

Listed:
  • Justine B Nasejje
  • Albert Whata
  • Charles Chimedza

Abstract

Research that compares two predictive models requires a rigorous statistical approach to draw valid inferences about differences in their performance. Researchers present estimates of model performance with little evidence on whether they reflect true differences. In this study, we apply two statistical tests, the 5 × 2-fold cross-validation (cv) paired t-test and the combined 5 × 2-fold cv F-test, to provide statistical evidence on differences in predictive performance between the Fine-Gray (FG) and random survival forest (RSF) models for competing risks. These models are trained on different scenarios of low-dimensional simulated survival data to determine whether the observed differences in their predictive performance are significant. Each simulation was repeated one hundred times on ten different seeds. The results indicate that the RSF model is superior in predictive performance in the presence of complex relationships (quadratic and interactions) between the outcome and its predictors. The two statistical tests show that the differences in performance are significant in the quadratic simulations but not in the interaction simulations. The study also shows that the FG model is superior in predictive performance in linear simulations and that its differences in predictive performance compared to the RSF model are significant. The combined 5 × 2-fold cv F-test has lower type I error rates than the 5 × 2-fold cv paired t-test.
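The two tests named in the abstract can be illustrated with a minimal sketch. It assumes the usual textbook definitions of the 5 × 2-fold cv paired t-test and the combined 5 × 2-fold cv F-test, and it is not the authors' implementation. The function names, the critical-value constants, and the example differences are hypothetical. Each pair holds the difference in a performance measure between the two models on the two folds of one replication.

```python
import math

# Critical values at the 5% level, taken from standard tables (an assumption
# of this sketch, not values reported in the paper):
T_CRIT_5DF = 2.571    # two-sided, Student t with 5 degrees of freedom
F_CRIT_10_5 = 4.735   # upper tail, F with (10, 5) degrees of freedom


def _check(diffs):
    """Require exactly 5 replications of 2 fold-wise differences."""
    if len(diffs) != 5 or any(len(pair) != 2 for pair in diffs):
        raise ValueError("expected 5 replications with 2 differences each")


def _variance_sum(diffs):
    """Sum over replications of s_i^2 = (d1 - mean)^2 + (d2 - mean)^2."""
    total = 0.0
    for d1, d2 in diffs:
        mean = (d1 + d2) / 2.0
        total += (d1 - mean) ** 2 + (d2 - mean) ** 2
    if total == 0.0:
        raise ValueError("all replications have zero variance")
    return total


def paired_t_5x2cv(diffs):
    """t = d_1^(1) / sqrt(mean of s_i^2), compared with t on 5 df."""
    _check(diffs)
    return diffs[0][0] / math.sqrt(_variance_sum(diffs) / 5.0)


def combined_f_5x2cv(diffs):
    """F = sum of all squared differences / (2 * sum of s_i^2), on (10, 5) df."""
    _check(diffs)
    numerator = sum(d1 ** 2 + d2 ** 2 for d1, d2 in diffs)
    return numerator / (2.0 * _variance_sum(diffs))


# Hypothetical fold-wise differences in performance between two models.
example_diffs = [(0.02, 0.04), (0.03, 0.01), (0.05, 0.03), (0.02, 0.02), (0.04, 0.02)]

t_stat = paired_t_5x2cv(example_diffs)
f_stat = combined_f_5x2cv(example_diffs)
print(f"t = {t_stat:.3f}, reject at 5%: {abs(t_stat) > T_CRIT_5DF}")
print(f"F = {f_stat:.3f}, reject at 5%: {f_stat > F_CRIT_10_5}")
```

With these made-up numbers the t statistic is about 1.58 (below 2.571) while the F statistic is 5.75 (above 4.735), so the two tests can reach different conclusions on the same differences. The t-test uses only the first difference in its numerator, whereas the F-test pools all ten.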

Suggested Citation

  • Justine B Nasejje & Albert Whata & Charles Chimedza, 2022. "Statistical approaches to identifying significant differences in predictive performance between machine learning and classical statistical models for survival data," PLOS ONE, Public Library of Science, vol. 17(12), pages 1-22, December.
  • Handle: RePEc:plo:pone00:0279435
    DOI: 10.1371/journal.pone.0279435

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0279435
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0279435&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Diana Hechavarría & Charles Matthews & Paul Reynolds, 2016. "Does start-up financing influence start-up speed? Evidence from the panel study of entrepreneurial dynamics," Small Business Economics, Springer, vol. 46(1), pages 137-167, January.
    2. Davis, Elizabeth E. & Krafft, Caroline & Forry, Nicole D., 2017. "Understanding churn: Predictors of reentry among families who leave the child care subsidy program in Maryland," Children and Youth Services Review, Elsevier, vol. 77(C), pages 34-45.
    3. Céspedes, Nikita & Gutiérrez, Ana Paola & Belapatiño, Vanessa, 2013. "Determinantes de la duración del desempleo en una economía con alta informalidad," Working Papers 2013-022, Banco Central de Reserva del Perú.
    4. Michael J. Crowther & Paul C. Lambert, 2012. "Simulating complex survival data," Stata Journal, StataCorp LLC, vol. 12(4), pages 674-687, December.
    5. Erzsébet Bukodi, 2012. "Serial Cohabitation among Men in Britain: Does Work History Matter? [Cohabitations successives des hommes en Angleterre : l’histoire professionnelle joue-t-elle un rôle ?]," European Journal of Population, Springer;European Association for Population Studies, vol. 28(4), pages 441-466, November.
    6. Lowell Hargens, 2019. "Incidence of first-marriage divorce among women in the 1979 panel of the National Longitudinal Survey of Youth," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 40(52), pages 1529-1536.
    7. Aedin Doris & Donal O'Neill & Olive Sweetman, 2017. "Does Reducing Unemployment Benefits during a Recession Reduce Youth Unemployment? Evidence from a 50% cut in Unemployment Assistance," Economics Department Working Paper Series n279-17.pdf, Department of Economics, National University of Ireland - Maynooth.
    8. Shelley Clark & Dana Hamplová, 2013. "Single Motherhood and Child Mortality in Sub-Saharan Africa: A Life Course Perspective," Demography, Springer;Population Association of America (PAA), vol. 50(5), pages 1521-1549, October.
    9. Dorsett, Richard & Lucchino, Paolo, 2018. "Young people's labour market transitions: The role of early experiences," Labour Economics, Elsevier, vol. 54(C), pages 29-46.
    10. Ognjen Obućina, 2016. "Partner Choice in Sweden Following a Failed Intermarriage," European Journal of Population, Springer;European Association for Population Studies, vol. 32(4), pages 511-542, October.
    11. Dimiter Philipov & Aiva Jasilioniene, 2007. "Union formation and fertility in Bulgaria and Russia: a life table description of recent trends," MPIDR Working Papers WP-2007-005, Max Planck Institute for Demographic Research, Rostock, Germany.
    12. Saha, U.R., 2012. "Econometric models of child mortality dynamics in rural Bangladesh," Other publications TiSEM f734b639-9696-480e-96f0-8, Tilburg University, School of Economics and Management.
    13. Belapatiño, Vanessa & Céspedes, Nikita & Gutierrez, Ana Paola, 2014. "La duración del desempleo en Lima Metropolitana," Revista Estudios Económicos, Banco Central de Reserva del Perú, issue 27, pages 67-80.
    14. Varga, Júlia, 2016. "Hova lettek az orvosok?. Az orvosok külföldre vándorlása és pályaelhagyása Magyarországon, 2003-2011 [Where have all the doctors gone?. Migration and attrition of physicians and dentists in Hungary," Közgazdasági Szemle (Economic Review - monthly of the Hungarian Academy of Sciences), Közgazdasági Szemle Alapítvány (Economic Review Foundation), vol. 0(1), pages 1-26.
    15. Dimiter Philipov & Aiva Jasilioniene, 2008. "Union formation and fertility in Bulgaria and Russia: A life table description of recent trends," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 19(62), pages 2057-2114.
    16. Amparo González-Ferrer & Clara Cortina & Teresa Castro Martín & Ognjen Obućina, 2018. "Mixed marriages between immigrants and natives in Spain: The gendered effect of marriage market constraints," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 39(1), pages 1-32.
    17. Marika Jalovaara & Anneli Miettinen, 2013. "Does his paycheck also matter?," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 28(31), pages 881-916.
    18. Saha, U.R. & van Soest, A.H.O. & Bijwaard, G.E., 2012. "Cause-specific Neonatal Deaths : Levels, Trend and Determinants in Rural Bangladesh, 1987-2005," Other publications TiSEM a51b9cf0-74dd-4bc2-bd6f-d, Tilburg University, School of Economics and Management.
    19. Sally R. Hinchliffe & Paul C. Lambert, 2013. "Extending the flexible parametric survival model for competing risks," Stata Journal, StataCorp LLC, vol. 13(2), pages 344-355, June.
    20. Pavlova, Elitsa & Signore, Simone, 2021. "The European venture capital landscape: An EIF perspective. Volume VI: The impact of VC on the exit and innovation outcomes of EIF-backed start-ups," EIF Working Paper Series 2021/70, European Investment Fund (EIF).
