
LASSOPACK and PDSLASSO: Prediction, model selection and causal inference with regularized regression

Author

Listed:
  • Achim Ahrens

    (Economic and Social Research Institute, Dublin)

  • Christian B Hansen

    (University of Chicago Booth School of Business)

  • Mark E Schaffer

    (Heriot-Watt University)

Abstract

Machine learning is attracting increasing attention among social scientists and economists, yet Stata has to date offered only a very limited set of machine learning tools. This one-hour session introduces two Stata packages, lassopack and pdslasso, which implement regularized regression methods, including but not limited to the lasso (Tibshirani 1996, Journal of the Royal Statistical Society Series B). The packages include features intended for prediction, model selection and causal inference, and are thus applicable in a wide range of settings. The commands allow for high-dimensional models, where the number of regressors may be large or may even exceed the number of observations, under the assumption of sparsity. The package lassopack implements the lasso, square-root lasso (Belloni et al. 2011, Biometrika; 2014, Annals of Statistics), elastic net (Zou and Hastie 2005, Journal of the Royal Statistical Society Series B), ridge regression (Hoerl and Kennard 1970, Technometrics), adaptive lasso (Zou 2006, Journal of the American Statistical Association) and post-estimation OLS. These methods rely on tuning parameters, which determine the degree and type of penalization. lassopack supports three approaches for selecting these tuning parameters: information criteria (implemented in lasso2), K-fold and h-step-ahead rolling cross-validation (cvlasso), and theory-driven penalization (rlasso) due to Belloni et al. (2012, Econometrica). In addition, rlasso implements the sup-score test of joint significance of the regressors due to Chernozhukov et al. (2013, Annals of Statistics).
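
As a rough illustration of how a tuning parameter governs the degree of penalization, and of how K-fold cross-validation can be used to choose it, the following minimal Python sketch fits the lasso by cyclic coordinate descent on simulated data. It is not the implementation used by lassopack or pdslasso, and it is written in Python only for illustration. The function names (soft_threshold, lasso_cd, cv_mse), the fixed grid of penalty levels, the simulated data and the deterministic fold assignment are all assumptions made for this example. The sketch also omits the intercept, standardization and penalty loadings.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator: solves the one-dimensional lasso problem."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*RSS + lam*sum(|b_j|) by cyclic coordinate descent.

    Illustrative only: no intercept, no standardization, fixed iteration count.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X*beta; beta starts at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Covariance of column j with the partial residual, which adds
            # back column j's own current contribution to the fit.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_b = soft_threshold(rho, lam) / col_ss[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_b
    return beta


def cv_mse(X, y, lam, k=5):
    """K-fold cross-validated mean squared prediction error for one penalty level."""
    n = len(X)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
        for i in test:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / n


# Simulated sparse model: only the first two of ten regressors matter.
random.seed(0)
n, p = 60, 10
true_beta = [2.0, -1.5] + [0.0] * (p - 2)
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0, 0.5) for row in X]

# Choose the penalty level from a small, arbitrary grid by 5-fold CV.
lambdas = [1.0, 0.5, 0.2, 0.1, 0.05, 0.01]
cv_errors = {lam: cv_mse(X, y, lam) for lam in lambdas}
best_lam = min(cv_errors, key=cv_errors.get)
beta_hat = lasso_cd(X, y, best_lam)
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print("chosen penalty level:", best_lam)
print("indices of selected regressors:", selected)
```

A larger penalty level sets more coefficients exactly to zero, and a sufficiently large one removes every regressor. Cross-validation is only one of the three selection approaches named in the abstract; information criteria and theory-driven penalization are not shown here.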

Suggested Citation

  • Achim Ahrens & Christian B Hansen & Mark E Schaffer, 2018. "LASSOPACK and PDSLASSO: Prediction, model selection and causal inference with regularized regression," London Stata Conference 2018 12, Stata Users Group.
  • Handle: RePEc:boc:usug18:12

    Download full text from publisher

    File URL: http://repec.org/usug2018/2018_AhrensSchaffer.pdf
    Download Restriction: no
