IDEAS home Printed from https://ideas.repec.org/p/ifs/cemmap/35-15.html
   My bibliography  Save this paper

Variable selection and estimation in high-dimensional models

Author

Listed:
  • Joel L. Horowitz

    () (Institute for Fiscal Studies and Northwestern University)

Abstract

Models with high-dimensional covariates arise frequently in economics and other fields. Often, only a few covariates have important effects on the dependent variable. When this happens, the model is said to be sparse. In applications, however, it is not known which covariates are important and which are not. This paper reviews methods for discriminating between important and unimportant covariates with particular attention given to methods that discriminate correctly with probability approaching 1 as the sample size increases. Methods are available for a wide variety of linear, nonlinear, semiparametric, and nonparametric models. The performance of some of these methods in finite samples is illustrated through Monte Carlo simulations and an empirical example.

Suggested Citation

  • Joel L. Horowitz, 2015. "Variable selection and estimation in high-dimensional models," CeMMAP working papers CWP35/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
  • Handle: RePEc:ifs:cemmap:35/15
    as

    Download full text from publisher

    File URL: https://www.ifs.org.uk/uploads/cemmap/wps/cwp351515.pdf
    Download Restriction: no

    References listed on IDEAS

    as
    1. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    2. Robinson, Peter M, 1988. "Root- N-Consistent Semiparametric Regression," Econometrica, Econometric Society, vol. 56(4), pages 931-954, July.
    3. Nicolai Meinshausen & Peter Bühlmann, 2010. "Stability selection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(4), pages 417-473, September.
    4. Kim, Yongdai & Choi, Hosik & Oh, Hee-Seok, 2008. "Smoothly Clipped Absolute Deviation on High Dimensions," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1665-1673.
    5. Wang, Tao & Xu, Pei-Rong & Zhu, Li-Xing, 2012. "Non-convex penalized estimation in high-dimensional models with single-index structure," Journal of Multivariate Analysis, Elsevier, vol. 109(C), pages 221-235.
    6. Jing Wang & Lijian Yang, 2009. "Efficient and fast spline-backfitted kernel smoothing of additive models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 61(3), pages 663-690, September.
    7. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    8. Heng Lian, 2012. "Variable selection in high-dimensional partly linear additive models," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 24(4), pages 825-839, December.
    9. Hansheng Wang & Bo Li & Chenlei Leng, 2009. "Shrinkage tuning parameter selection with a diverging number of parameters," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 671-683, June.
    10. Powell, James L & Stock, James H & Stoker, Thomas M, 1989. "Semiparametric Estimation of Index Coefficients," Econometrica, Econometric Society, vol. 57(6), pages 1403-1430, November.
    11. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    12. Efang Kong & Yingcun Xia, 2007. "Variable selection for the single‐index model," Biometrika, Biometrika Trust, vol. 94(1), pages 217-229.
    13. Hansheng Wang & Runze Li & Chih-Ling Tsai, 2007. "Tuning parameter selectors for the smoothly clipped absolute deviation method," Biometrika, Biometrika Trust, vol. 94(3), pages 553-568.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Knaus, Michael C. & Lechner, Michael & Strittmatter, Anthony, 2017. "Heterogeneous Employment Effects of Job Search Programmes: A Machine Learning Approach," IZA Discussion Papers 10961, Institute for the Study of Labor (IZA).
    2. Joel L. Horowitz & Lars Nesheim, 2018. "Using penalized likelihood to select parameters in a random coefficients multinomial logit model," CeMMAP working papers CWP29/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:ifs:cemmap:35/15. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: (Emma Hyman). General contact details of provider: http://edirc.repec.org/data/cmifsuk.html .

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service hosted by the Research Division of the Federal Reserve Bank of St. Louis . RePEc uses bibliographic data supplied by the respective publishers.