
Variable selection in discrete survival models including heterogeneity

Author

Listed:
  • Andreas Groll

    (Ludwig-Maximilians-Universität München)

  • Gerhard Tutz

    (Ludwig-Maximilians-Universität München)

Abstract

Several variable selection procedures are available for continuous time-to-event data. However, if time is measured on a discrete scale, so that many ties occur, models for continuous time are inadequate. We propose penalized likelihood methods that perform efficient variable selection in discrete survival modeling while explicitly modeling the heterogeneity in the population. The method is based on a combination of ridge- and lasso-type penalties that are tailored to the case of discrete survival. The performance is examined in simulation studies and in an application to the birth of the first child.
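To make the idea in the abstract concrete, the following is a minimal sketch of a penalized discrete-time hazard model, not the authors' implementation. It assumes a logistic link, a person-period data expansion, a small ridge penalty on the baseline hazard parameters, and a lasso penalty on the covariate effects; the heterogeneity (random effect) term discussed in the abstract is omitted. All function names, the proximal-gradient solver, and the tuning values (`lam`, `ridge`) are illustrative assumptions.

```python
import numpy as np


def person_period(times, events, X):
    """Expand one row per subject into one row per subject and period at risk."""
    n_periods = int(np.max(times))
    rows, y = [], []
    for t_i, d_i, x_i in zip(times, events, X):
        for t in range(1, int(t_i) + 1):
            baseline = np.zeros(n_periods)
            baseline[t - 1] = 1.0  # dummy for the period-specific baseline hazard
            rows.append(np.concatenate([baseline, x_i]))
            # Response is 1 only in the period where an observed event occurs.
            y.append(1.0 if (t == int(t_i) and d_i == 1) else 0.0)
    return np.asarray(rows), np.asarray(y), n_periods


def soft_threshold(z, thr):
    """Proximal operator of the L1 norm: shrink towards zero by thr."""
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)


def fit_penalized_hazard(Z, y, n_baseline, lam, ridge=1e-3, n_iter=2000):
    """Proximal gradient descent for a logistic discrete hazard model.

    Assumption: ridge penalty on the first n_baseline coefficients,
    lasso penalty (weight lam) on the remaining covariate effects.
    """
    n = len(y)
    # Step size 1/L, where L bounds the curvature of the smooth part.
    step = 1.0 / (np.linalg.norm(Z, 2) ** 2 / (4.0 * n) + ridge)
    coef = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        hazard = 1.0 / (1.0 + np.exp(-(Z @ coef)))
        grad = Z.T @ (hazard - y) / n
        grad[:n_baseline] += ridge * coef[:n_baseline]
        coef = coef - step * grad
        coef[n_baseline:] = soft_threshold(coef[n_baseline:], step * lam)
    return coef


if __name__ == "__main__":
    # Simulated example (hypothetical data): only the first covariate matters.
    rng = np.random.default_rng(0)
    n, max_t = 200, 5
    X = rng.normal(size=(n, 3))
    times = np.full(n, max_t)
    events = np.zeros(n, dtype=int)
    for i in range(n):
        for t in range(1, max_t + 1):
            h = 1.0 / (1.0 + np.exp(-(-1.5 + 1.0 * X[i, 0])))
            if rng.random() < h:
                times[i], events[i] = t, 1
                break
    Z, y, k = person_period(times, events, X)
    coef = fit_penalized_hazard(Z, y, n_baseline=k, lam=0.02)
    print("baseline:", np.round(coef[:k], 2))
    print("covariates:", np.round(coef[k:], 2))
```

Subjects who reach the last period without an event contribute only zeros to the response, which is how right censoring enters the person-period likelihood in this sketch. In practice the lasso weight would be chosen by cross-validation or an information criterion rather than fixed by hand.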

Suggested Citation

  • Andreas Groll & Gerhard Tutz, 2017. "Variable selection in discrete survival models including heterogeneity," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(2), pages 305-338, April.
  • Handle: RePEc:spr:lifeda:v:23:y:2017:i:2:d:10.1007_s10985-016-9359-y
    DOI: 10.1007/s10985-016-9359-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10985-016-9359-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10985-016-9359-y?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
    2. Hartzel, Jonathan & Liu, I-Ming & Agresti, Alan, 2001. "Describing heterogeneous effects in stratified ordinal contingency tables, with application to multi-center clinical trials," Computational Statistics & Data Analysis, Elsevier, vol. 35(4), pages 429-449, February.
    3. Pötscher, Benedikt M. & Leeb, Hannes, 2009. "On the distribution of penalized maximum likelihood estimators: The LASSO, SCAD, and thresholding," Journal of Multivariate Analysis, Elsevier, vol. 100(9), pages 2065-2082, October.
    4. Ham, John C & Rea, Samuel A, Jr, 1987. "Unemployment Insurance and Male Unemployment Duration in Canada," Journal of Labor Economics, University of Chicago Press, vol. 5(3), pages 325-353, July.
    5. Van den Berg, Gerard J., 2001. "Duration models: specification, identification and multiple durations," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 5, chapter 55, pages 3381-3460, Elsevier.
    6. Nicoletti, Cheti & Rondinelli, Concetta, 2010. "The (mis)specification of discrete duration models with unobserved heterogeneity: A Monte Carlo study," Journal of Econometrics, Elsevier, vol. 159(1), pages 1-13, November.
    7. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    8. Göran Kauermann & Gerhard Tutz & Josef Brüderl, 2005. "The survival of newly founded firms: a case‐study into varying‐coefficient models," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 168(1), pages 145-158, January.
    9. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    10. Leeb, Hannes & Pötscher, Benedikt M., 2005. "Model Selection And Inference: Facts And Fiction," Econometric Theory, Cambridge University Press, vol. 21(1), pages 21-59, February.
    11. Simon, Noah & Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2011. "Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 39(i05).
    12. Baker, Michael & Melino, Angelo, 2000. "Duration dependence and nonparametric heterogeneity: A Monte Carlo study," Journal of Econometrics, Elsevier, vol. 96(2), pages 357-393, June.
    13. Heckman, James J. & Singer, Burton, 1984. "Econometric duration analysis," Journal of Econometrics, Elsevier, vol. 24(1-2), pages 63-132.
    14. Rondeau, Virginie & Marzroui, Yassin & Gonzalez, Juan R., 2012. "frailtypack: An R Package for the Analysis of Correlated Survival Data with Frailty Models Using Penalized Likelihood Estimation or Parametrical Estimation," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 47(i04).
    15. Fahrmeir, Ludwig & Kneib, Thomas, 2011. "Bayesian Smoothing and Regression for Longitudinal, Spatial and Event History Data," OUP Catalogue, Oxford University Press, number 9780199533022.

    Citations


    Cited by:

    1. Marie-Therese Puth & Gerhard Tutz & Nils Heim & Eva Münster & Matthias Schmid & Moritz Berger, 2020. "Tree-based modeling of time-varying coefficients in discrete time-to-event models," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 26(3), pages 545-572, July.
    2. Xiao-Dong Zhou & Yun-Juan Wang & Rong-Xian Yue, 2021. "Optimal designs for discrete-time survival models with random effects," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 27(2), pages 300-332, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hess, Wolfgang & Persson, Maria, 2010. "The Duration of Trade Revisited. Continuous-Time vs. Discrete-Time Hazards," Working Papers 2010:1, Lund University, Department of Economics.
    2. Hess, Wolfgang & Persson, Maria, 2009. "Survival and Death in International Trade - Discrete-Time Durations of EU Imports," Working Papers 2009:12, Lund University, Department of Economics.
    3. Xianyi Wu & Xian Zhou, 2019. "On Hodges’ superefficiency and merits of oracle property in model selection," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(5), pages 1093-1119, October.
    4. Wolfgang Hess & Maria Persson, 2012. "The duration of trade revisited," Empirical Economics, Springer, vol. 43(3), pages 1083-1107, December.
    5. Christopher J Greenwood & George J Youssef & Primrose Letcher & Jacqui A Macdonald & Lauryn J Hagg & Ann Sanson & Jenn Mcintosh & Delyse M Hutchinson & John W Toumbourou & Matthew Fuller-Tyszkiewicz &, 2020. "A comparison of penalised regression methods for informing the selection of predictive markers," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-14, November.
    6. Anders Bredahl Kock, 2012. "On the Oracle Property of the Adaptive Lasso in Stationary and Nonstationary Autoregressions," CREATES Research Papers 2012-05, Department of Economics and Business Economics, Aarhus University.
    7. Damien Rousselière, 2019. "A Flexible Approach to Age Dependence in Organizational Mortality: Comparing the Life Duration for Cooperative and Non-Cooperative Enterprises Using a Bayesian Generalized Additive Discrete Time Survi," Journal of Quantitative Economics, Springer;The Indian Econometric Society (TIES), vol. 17(4), pages 829-855, December.
    8. Hausman, Jerry A. & Woutersen, Tiemen, 2014. "Estimating a semi-parametric duration model without specifying heterogeneity," Journal of Econometrics, Elsevier, vol. 178(P1), pages 114-131.
    9. Zhixuan Fu & Shuangge Ma & Haiqun Lin & Chirag R. Parikh & Bingqing Zhou, 2017. "Penalized Variable Selection for Multi-center Competing Risks Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(2), pages 379-405, December.
    10. Carvalho, Carlos & Masini, Ricardo & Medeiros, Marcelo C., 2018. "ArCo: An artificial counterfactual approach for high-dimensional panel time-series data," Journal of Econometrics, Elsevier, vol. 207(2), pages 352-380.
    11. Jaap H. Abbring & Gerard J. Van Den Berg, 2007. "The unobserved heterogeneity distribution in duration analysis," Biometrika, Biometrika Trust, vol. 94(1), pages 87-99.
    12. Finnie, Ross & Gray, David, 2002. "Earnings dynamics in Canada: an econometric analysis," Labour Economics, Elsevier, vol. 9(6), pages 763-800, December.
    13. Claude Renaux & Laura Buzdugan & Markus Kalisch & Peter Bühlmann, 2020. "Hierarchical inference for genome-wide association studies: a view on methodology with software," Computational Statistics, Springer, vol. 35(1), pages 1-40, March.
    14. Emmanuel O. Ogundimu, 2022. "Regularization and variable selection in Heckman selection model," Statistical Papers, Springer, vol. 63(2), pages 421-439, April.
    15. Nicoletti, Cheti & Rondinelli, Concetta, 2010. "The (mis)specification of discrete duration models with unobserved heterogeneity: A Monte Carlo study," Journal of Econometrics, Elsevier, vol. 159(1), pages 1-13, November.
    16. Max H. Farrell, 2013. "Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations," Papers 1309.4686, arXiv.org, revised Feb 2018.
    17. Hui Xiao & Yiguo Sun, 2019. "On Tuning Parameter Selection in Model Selection and Model Averaging: A Monte Carlo Study," JRFM, MDPI, vol. 12(3), pages 1-16, June.
    18. Tae-Hwy Lee & Zhou Xi & Ru Zhang, 2013. "Testing for Neglected Nonlinearity Using Regularized Artificial Neural Networks," Working Papers 201422, University of California at Riverside, Department of Economics, revised Apr 2012.
    19. Van den Berg, Gerard J., 2001. "Duration models: specification, identification and multiple durations," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 5, chapter 55, pages 3381-3460, Elsevier.
    20. Alain Hecq & Luca Margaritella & Stephan Smeekes, 2023. "Granger Causality Testing in High-Dimensional VARs: A Post-Double-Selection Procedure," Journal of Financial Econometrics, Oxford University Press, vol. 21(3), pages 915-958.
