Adaptive bi-level variable selection for multivariate failure time model with a diverging number of covariates

Author

Listed:
  • Kaida Cai

    (Southeast University
    University of Calgary)

  • Hua Shen

    (University of Calgary)

  • Xuewen Lu

    (University of Calgary)

Abstract

In this study, we propose an adaptive bi-level variable selection method for analyzing multivariate failure time data. In the regression setting, we treat the coefficients corresponding to the same predictor variable as a natural group and then perform variable selection at the group level and the individual level simultaneously. By imitating the group variable selection procedure with an adaptive bi-level penalty, the proposed method can select a predictor variable at two different levels, allowing different covariate effects for different event types: the group level, where the predictor is important to all failure types, and the individual level, where the predictor is important to only some failure types. An algorithm based on cyclic coordinate descent is developed to carry out the proposed method. In simulations, our method outperforms the classical penalty methods, especially in removing unimportant variables for different failure types. We establish the asymptotic oracle properties of the proposed variable selection method in the case of a diverging number of covariates. We construct a generalized cross-validation method for tuning parameter selection and assess model performance using model errors. We also illustrate the proposed method using a real-life data set.
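
To make the two selection levels concrete, the following is a minimal sketch and not the authors' penalty or algorithm. It assumes a sparse-group-lasso-type penalty (an L1 term plus a group L2 term) as a stand-in, and applies its proximal step to each row of a hypothetical coefficient matrix whose rows are predictors and whose columns are failure types. The names `soft_threshold`, `bilevel_prox`, `lam_ind`, and `lam_grp`, and all numeric values, are illustrative assumptions.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def bilevel_prox(z, lam_ind, lam_grp):
    """Proximal step for lam_ind * ||b||_1 + lam_grp * ||b||_2 on one group.

    z       : coefficients of one predictor across failure types (one group)
    lam_ind : individual-level threshold (assumed stand-in penalty level)
    lam_grp : group-level threshold (assumed stand-in penalty level)
    """
    # Individual level: shrink each coefficient, zeroing the small ones.
    u = soft_threshold(z, lam_ind)
    norm = np.linalg.norm(u)
    # Group level: drop the whole predictor if what remains is too weak.
    if norm <= lam_grp:
        return np.zeros_like(u)
    return (1.0 - lam_grp / norm) * u


# Hypothetical unpenalized coefficients: 3 predictors x 3 failure types.
raw = np.array([
    [2.0, -1.5, 1.0],    # matters for every failure type
    [0.9, 0.1, -0.05],   # matters for the first failure type only
    [0.2, -0.1, 0.15],   # matters for none
])

selected = np.vstack([bilevel_prox(row, lam_ind=0.3, lam_grp=0.5) for row in raw])
support = selected != 0  # True where a coefficient survives selection
print(support)
```

In this toy input, the first predictor is kept for every failure type, the second survives only for the first failure type (individual-level selection), and the third is removed entirely (group-level selection). An adaptive variant would replace the two fixed thresholds with data-driven weights, which is again an assumption about the general technique and not a description of the paper.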

Suggested Citation

  • Kaida Cai & Hua Shen & Xuewen Lu, 2022. "Adaptive bi-level variable selection for multivariate failure time model with a diverging number of covariates," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(4), pages 968-993, December.
  • Handle: RePEc:spr:testjl:v:31:y:2022:i:4:d:10.1007_s11749-022-00809-y
    DOI: 10.1007/s11749-022-00809-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11749-022-00809-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. P. Tseng, 2001. "Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization," Journal of Optimization Theory and Applications, Springer, vol. 109(3), pages 475-494, June.
    2. Limin X. Clegg & Jianwen Cai & Pranab K. Sen, 1999. "A Marginal Mixed Baseline Hazards Model for Multivariate Failure Time Data," Biometrics, The International Biometric Society, vol. 55(3), pages 805-812, September.
    3. Wang, Hansheng & Leng, Chenlei, 2008. "A note on adaptive group lasso," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5277-5286, August.
    4. Jian Huang & Shuangge Ma & Huiliang Xie & Cun-Hui Zhang, 2009. "A group bridge approach for variable selection," Biometrika, Biometrika Trust, vol. 96(2), pages 339-355.
    5. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    6. Jianwen Cai & Jianqing Fan & Runze Li & Haibo Zhou, 2005. "Variable selection for multivariate failure time data," Biometrika, Biometrika Trust, vol. 92(2), pages 303-316, June.
    7. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    8. Lukas Meier & Sara Van De Geer & Peter Bühlmann, 2008. "The group lasso for logistic regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(1), pages 53-71, February.
    9. S. Wang & B. Nan & N. Zhu & J. Zhu, 2009. "Hierarchically penalized Cox regression with grouped variables," Biometrika, Biometrika Trust, vol. 96(2), pages 307-322.
    10. Jicai Liu & Riquan Zhang & Weihua Zhao & Yazhao Lv, 2016. "Variable selection in partially linear hazard regression for multivariate failure time data," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(2), pages 375-394, June.
    11. Niu, Yi & Peng, Yingwei, 2014. "Marginal regression analysis of clustered failure time data with a cure fraction," Journal of Multivariate Analysis, Elsevier, vol. 123(C), pages 129-142.
    12. Liu, Jicai & Zhang, Riquan & Zhao, Weihua & Lv, Yazhao, 2015. "Variable selection in semiparametric hazard regression for multivariate survival data," Journal of Multivariate Analysis, Elsevier, vol. 142(C), pages 26-40.
    13. Benjamin Poignard, 2020. "Asymptotic theory of the adaptive Sparse Group Lasso," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 72(1), pages 297-328, February.
    14. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    15. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    16. Ming Yuan & Yi Lin, 2006. "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 49-67, February.
    17. Zhaozhi Fan & Xiao-Feng Wang, 2009. "Marginal hazards model for multivariate failure time data with auxiliary covariates," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 21(7), pages 771-786.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    2. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    3. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    4. Yanfang Zhang & Chuanhua Wei & Xiaolin Liu, 2022. "Group Logistic Regression Models with l p,q Regularization," Mathematics, MDPI, vol. 10(13), pages 1-15, June.
    5. Young Joo Yoon & Cheolwoo Park & Erik Hofmeister & Sangwook Kang, 2012. "Group variable selection in cardiopulmonary cerebral resuscitation data for veterinary patients," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(7), pages 1605-1621, January.
    6. Yanming Li & Bin Nan & Ji Zhu, 2015. "Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure," Biometrics, The International Biometric Society, vol. 71(2), pages 354-363, June.
    7. Wenyan Zhong & Xuewen Lu & Jingjing Wu, 2021. "Bi-level variable selection in semiparametric transformation models with right-censored data," Computational Statistics, Springer, vol. 36(3), pages 1661-1692, September.
    8. Mingqiu Wang & Guo-Liang Tian, 2016. "Robust group non-convex estimations for high-dimensional partially linear models," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(1), pages 49-67, March.
    9. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    10. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    11. Takumi Saegusa & Tianzhou Ma & Gang Li & Ying Qing Chen & Mei-Ling Ting Lee, 2020. "Variable Selection in Threshold Regression Model with Applications to HIV Drug Adherence Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(3), pages 376-398, December.
    12. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    13. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    14. Justin B. Post & Howard D. Bondell, 2013. "Factor Selection and Structural Identification in the Interaction ANOVA Model," Biometrics, The International Biometric Society, vol. 69(1), pages 70-79, March.
    15. Yun Li & George T. O’Connor & Josée Dupuis & Eric Kolaczyk, 2015. "Modeling gene-covariate interactions in sparse regression with group structure for genome-wide association studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 14(3), pages 265-277, June.
    16. Jonathan Boss & Alexander Rix & Yin‐Hsiu Chen & Naveen N. Narisetty & Zhenke Wu & Kelly K. Ferguson & Thomas F. McElrath & John D. Meeker & Bhramar Mukherjee, 2021. "A hierarchical integrative group least absolute shrinkage and selection operator for analyzing environmental mixtures," Environmetrics, John Wiley & Sons, Ltd., vol. 32(8), December.
    17. Mogliani, Matteo & Simoni, Anna, 2021. "Bayesian MIDAS penalized regressions: Estimation, selection, and prediction," Journal of Econometrics, Elsevier, vol. 222(1), pages 833-860.
    18. Haibin Zhang & Juan Wei & Meixia Li & Jie Zhou & Miantao Chao, 2014. "On proximal gradient method for the convex problems regularized with the group reproducing kernel norm," Journal of Global Optimization, Springer, vol. 58(1), pages 169-188, January.
    19. Caiya Zhang & Yanbiao Xiang, 2016. "On the oracle property of adaptive group Lasso in high-dimensional linear models," Statistical Papers, Springer, vol. 57(1), pages 249-265, March.
    20. Arfan Raheen Afzal & Jing Yang & Xuewen Lu, 2021. "Variable selection in partially linear additive hazards model with grouped covariates and a diverging number of parameters," Computational Statistics, Springer, vol. 36(2), pages 829-855, June.
