
A doubly sparse approach for group variable selection

Authors

  • Sunghoon Kwon (Konkuk University)
  • Jeongyoun Ahn (University of Georgia)
  • Woncheol Jang (Seoul National University)
  • Sangin Lee (University of Texas Southwestern Medical Center)
  • Yongdai Kim (Seoul National University)

Abstract

We propose a new penalty, called the doubly sparse (DS) penalty, for variable selection in high-dimensional linear regression models when the covariates are naturally grouped. An advantage of the DS penalty over other penalties is that it provides a clear way of controlling sparsity between groups and within groups separately. We prove that there exists a unique global minimizer of the DS penalized sum of squared residuals and show how the DS penalty selects groups and variables within selected groups, even when the number of groups exceeds the sample size. We also introduce an efficient optimization algorithm. Results from simulation studies and real data analysis show that the DS penalty outperforms other existing penalties in finite samples.
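
The abstract does not state the formula of the DS penalty, so the sketch below does not implement it. As a stand-in illustration of what selecting "between and within groups" can look like, it uses the proximal operator of the sparse group lasso penalty, a different and convex penalty, in an unweighted form. The function names, the two tuning parameters `lam_within` and `lam_between`, and the toy coefficient vector are assumptions made for this example only.

```python
import math


def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sparse_group_prox(beta, groups, lam_within, lam_between):
    """Proximal operator of lam_within * ||b||_1 + lam_between * sum_k ||b_k||_2.

    NOTE: this is the (unweighted) sparse group lasso penalty, used here only
    as an illustration. It is NOT the DS penalty from the article.
    """
    out = [0.0] * len(beta)
    for idx in groups:
        # Within-group step: soft-threshold each coefficient individually.
        shrunk = [soft_threshold(beta[j], lam_within) for j in idx]
        norm = math.sqrt(sum(v * v for v in shrunk))
        # Between-group step: shrink the whole group, dropping it entirely
        # when its Euclidean norm does not exceed lam_between.
        scale = max(1.0 - lam_between / norm, 0.0) if norm > 0 else 0.0
        for j, v in zip(idx, shrunk):
            out[j] = scale * v
    return out


# Hypothetical toy input: two groups of three coefficients each.
beta = [3.0, 0.2, -4.0, 0.3, -0.1, 0.2]
groups = [[0, 1, 2], [3, 4, 5]]
result = sparse_group_prox(beta, groups, lam_within=0.5, lam_between=1.0)
print(result)
```

In this toy input the first group is kept but its small middle coefficient is set to zero (within-group sparsity), while the second group is removed entirely (between-group sparsity). The two parameters act on different levels, which is the general idea of bi-level selection; how the DS penalty itself achieves this is described in the full article.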

Suggested Citation

  • Sunghoon Kwon & Jeongyoun Ahn & Woncheol Jang & Sangin Lee & Yongdai Kim, 2017. "A doubly sparse approach for group variable selection," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(5), pages 997-1025, October.
  • Handle: RePEc:spr:aistmt:v:69:y:2017:i:5:d:10.1007_s10463-016-0571-z
    DOI: 10.1007/s10463-016-0571-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10463-016-0571-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yanxin Wang & Qibin Fan & Li Zhu, 2018. "Variable selection and estimation using a continuous approximation to the $L_0$ penalty," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(1), pages 191-214, February.
    2. Gaorong Li & Liugen Xue & Heng Lian, 2012. "SCAD-penalised generalised additive models with non-polynomial dimensionality," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 24(3), pages 681-697.
    3. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    4. Fei Jin & Lung-fei Lee, 2018. "Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices," Econometrics, MDPI, vol. 6(1), pages 1-24, February.
    5. Jeon, Jong-June & Kwon, Sunghoon & Choi, Hosik, 2017. "Homogeneity detection for the high-dimensional generalized linear model," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 61-74.
    6. Jin, Fei & Lee, Lung-fei, 2018. "Irregular N2SLS and LASSO estimation of the matrix exponential spatial specification model," Journal of Econometrics, Elsevier, vol. 206(2), pages 336-358.
    7. Joel L. Horowitz, 2015. "Variable selection and estimation in high-dimensional models," CeMMAP working papers 35/15, Institute for Fiscal Studies.
    8. Wenyan Zhong & Xuewen Lu & Jingjing Wu, 2021. "Bi-level variable selection in semiparametric transformation models with right-censored data," Computational Statistics, Springer, vol. 36(3), pages 1661-1692, September.
    9. Joel L. Horowitz, 2015. "Variable selection and estimation in high‐dimensional models," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 48(2), pages 389-407, May.
    10. Hirose, Kei & Tateishi, Shohei & Konishi, Sadanori, 2013. "Tuning parameter selection in sparse regression modeling," Computational Statistics & Data Analysis, Elsevier, vol. 59(C), pages 28-40.
    11. Karsten Schweikert, 2022. "Oracle Efficient Estimation of Structural Breaks in Cointegrating Regressions," Journal of Time Series Analysis, Wiley Blackwell, vol. 43(1), pages 83-104, January.
    12. Lian, Heng & Li, Jianbo & Tang, Xingyu, 2014. "SCAD-penalized regression in additive partially linear proportional hazards models with an ultra-high-dimensional linear part," Journal of Multivariate Analysis, Elsevier, vol. 125(C), pages 50-64.
    13. Lian, Heng, 2014. "Semiparametric Bayesian information criterion for model selection in ultra-high dimensional additive models," Journal of Multivariate Analysis, Elsevier, vol. 123(C), pages 304-310.
    14. Mingqiu Wang & Guo-Liang Tian, 2016. "Robust group non-convex estimations for high-dimensional partially linear models," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(1), pages 49-67, March.
    15. Karsten Schweikert, 2020. "Oracle Efficient Estimation of Structural Breaks in Cointegrating Regressions," Papers 2001.07949, arXiv.org, revised Apr 2021.
    16. Lee, Sangin & Kim, Yongdai & Kwon, Sunghoon, 2012. "Quadratic approximation for nonconvex penalized estimations with a diverging number of parameters," Statistics & Probability Letters, Elsevier, vol. 82(9), pages 1710-1717.
    17. Joel L. Horowitz, 2015. "Variable selection and estimation in high-dimensional models," Canadian Journal of Economics, Canadian Economics Association, vol. 48(2), pages 389-407, May.
    18. Joel L. Horowitz, 2015. "Variable selection and estimation in high-dimensional models," CeMMAP working papers CWP35/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    19. Yongdai Kim & Jong-June Jeon & Sangmi Han, 2016. "A Necessary Condition for the Strong Oracle Property," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(2), pages 610-624, June.
    20. Peng, Heng & Lu, Ying, 2012. "Model selection in linear mixed effect models," Journal of Multivariate Analysis, Elsevier, vol. 109(C), pages 109-129.
