
The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications

Listed author(s):
  • Christian Hansen (Booth School of Business, University of Chicago)
  • Yuan Liao (Department of Economics, Rutgers University)

    We consider inference about coefficients on a small number of variables of interest in a linear panel data model with additive unobserved individual- and time-specific effects and a large number of additional time-varying confounding variables. We allow the number of these confounding variables to exceed the sample size and suppose that, in addition to unrestricted time- and individual-specific effects, they are generated by a small number of common factors and high-dimensional weakly dependent disturbances. Both the factors and the disturbances may be related to the outcome variable and to the other variables of interest. To make informative inference feasible, we impose that the contribution of the part of the confounding variables not captured by time-specific effects, individual-specific effects, or the common factors can be captured by a relatively small number of terms whose identities are unknown. Within this framework, we provide a convenient computational algorithm for inference about parameters of interest, based on factor extraction followed by lasso regression, and show that the resulting procedure has good asymptotic properties. We also provide a simple k-step bootstrap procedure that may be used to construct inferential statements about parameters of interest and prove its asymptotic validity. The proposed bootstrap may be of independent interest outside the present context, as it can readily be adapted to other settings involving inference after lasso variable selection and the proof of its validity requires some new technical arguments. We also provide simulation evidence about the performance of our procedure and illustrate its use in two empirical applications.
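
The "factor extraction followed by lasso regression" idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single cross-section with no individual or time effects, a known number of factors `k`, a fixed user-chosen penalty `lam`, and a double-selection step (lasso for both the outcome and the variable of interest) that the abstract does not spell out. All function names (`extract_factors`, `residualize`, `lasso_cd`, `factor_lasso`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def extract_factors(X, k):
    """Estimate k factors from the n x p matrix X by principal components."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Scale so that the estimated factors satisfy F'F / n = I.
    return np.sqrt(X.shape[0]) * U[:, :k]

def residualize(A, F):
    """Residuals from a least-squares projection of A on an intercept and F."""
    Z = np.column_stack([np.ones(F.shape[0]), F])
    coef, *_ = np.linalg.lstsq(Z, A, rcond=None)
    return A - Z @ coef

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent:
    minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()  # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            r += X[:, j] * b[j]            # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]            # add the updated coordinate back
    return b

def factor_lasso(y, d, X, k, lam):
    """Illustrative estimate of the coefficient on d (assumed design)."""
    F = extract_factors(X, k)              # step 1: factor extraction
    U = residualize(X, F)                  # idiosyncratic part of the controls
    y_res = residualize(y, F)
    d_res = residualize(d, F)
    # Step 2: lasso selection among the idiosyncratic components,
    # for both the outcome and the variable of interest.
    sel_y = np.flatnonzero(lasso_cd(U, y_res, lam))
    sel_d = np.flatnonzero(lasso_cd(U, d_res, lam))
    selected = np.union1d(sel_y, sel_d)
    # Step 3: least squares of y on d, the factors, and the selected terms.
    W = np.column_stack([np.ones(len(y)), d, F, U[:, selected]])
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    return coef[1], selected
```

Calling `factor_lasso(y, d, X, k, lam)` with a length-n outcome `y`, a length-n variable of interest `d`, and an n x p control matrix `X` returns the point estimate and the indices of the selected controls. Standard errors, penalty selection, the fixed-effects transformations, and the k-step bootstrap discussed in the abstract are deliberately left out of this sketch.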

    Download Restriction: no

    Paper provided by Rutgers University, Department of Economics in its series Departmental Working Papers with number 201610.

    Length: 81 pages
    Date of creation: 29 Nov 2016
    Handle: RePEc:rut:rutres:201610
    Contact details of provider:
    Postal: New Jersey Hall - 75 Hamilton Street, New Brunswick, NJ 08901-1248
    Phone: (732) 932-7363
    Fax: (732) 932-7416

