IDEAS home Printed from https://ideas.repec.org/p/pri/econom/2020-2.html

Bias-Aware Inference in Regularized Regression Models

Author

Listed:
  • Timothy B. Armstrong

    (Yale University)

  • Michal Kolesár

    (Princeton University)

  • Soonwoo Kwon

    (Yale University)

Abstract

We consider inference on a regression coefficient under a constraint on the magnitude of the control coefficients. We show that a class of estimators based on an auxiliary regularized regression of the regressor of interest on control variables exactly solves a tradeoff between worst-case bias and variance. We derive "bias-aware" confidence intervals (CIs) based on these estimators, which take into account possible bias when forming the critical value. We show that these estimators and CIs are near-optimal in finite samples for mean squared error and CI length. Our finite-sample results are based on an idealized setting with normal regression errors with known homoskedastic variance, and we provide conditions for asymptotic validity with unknown and possibly heteroskedastic error distribution. Focusing on the case where the constraint on the magnitude of control coefficients is based on an ℓ_p norm (p ≥ 1), we derive rates of convergence for optimal estimators and CIs under high-dimensional asymptotics that allow the number of regressors to increase more quickly than the number of observations.
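
To illustrate what "taking possible bias into account when forming the critical value" can look like, here is a minimal Python sketch. It is not the authors' code. It assumes one common construction: the estimator is normal with a known standard error and a known bound on the absolute bias, and the critical value is the 1 - alpha quantile of |Z + b|, where Z is standard normal and b is the ratio of the worst-case bias to the standard error. The function names and inputs are hypothetical and chosen only for this example.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def bias_aware_critical_value(bias_sd_ratio: float, alpha: float = 0.05) -> float:
    """Return the 1 - alpha quantile of |Z + b| with Z ~ N(0, 1).

    b = bias_sd_ratio is the assumed worst-case bias divided by the
    standard error. With b = 0 this is the usual two-sided critical value.
    """
    b = abs(bias_sd_ratio)

    def coverage(c: float) -> float:
        # P(|Z + b| <= c) = Phi(c - b) - Phi(-c - b)
        return _STD_NORMAL.cdf(c - b) - _STD_NORMAL.cdf(-c - b)

    # coverage is increasing in c, so solve coverage(c) = 1 - alpha by bisection.
    lo, hi = 0.0, b + 10.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if coverage(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def bias_aware_ci(estimate: float, std_error: float, max_bias: float,
                  alpha: float = 0.05) -> tuple[float, float]:
    """Fixed-length interval: estimate +/- cv(max_bias / std_error) * std_error."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    cv = bias_aware_critical_value(max_bias / std_error, alpha)
    half_length = cv * std_error
    return estimate - half_length, estimate + half_length


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(bias_aware_critical_value(0.0))
    print(bias_aware_ci(estimate=1.2, std_error=0.3, max_bias=0.15))
```

Under these assumptions the interval is never longer than the naive alternative of widening the usual interval by the bias bound on each side, because the critical value lies between the usual one and the usual one plus b.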

Suggested Citation

  • Timothy B. Armstrong & Michal Kolesár & Soonwoo Kwon, 2020. "Bias-Aware Inference in Regularized Regression Models," Working Papers 2020-2, Princeton University. Economics Department.
  • Handle: RePEc:pri:econom:2020-2

    Download full text from publisher

    File URL: https://www.princeton.edu/~mkolesar/papers/regularized_regression.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Timothy B. Armstrong & Michal Kolesár, 2018. "Optimal Inference in a Class of Regression Models," Econometrica, Econometric Society, vol. 86(2), pages 655-683, March.
    2. Robinson, Peter M, 1988. "Root-N-Consistent Semiparametric Regression," Econometrica, Econometric Society, vol. 56(4), pages 931-954, July.
    3. Armstrong, Timothy B. & Chan, Hock Peng, 2016. "Multiscale adaptive inference on conditional moment inequalities," Journal of Econometrics, Elsevier, vol. 194(1), pages 24-43.
    4. Emily Oster, 2019. "Unobservable Selection and Coefficient Stability: Theory and Evidence," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 37(2), pages 187-204, April.
    5. Timothy B. Armstrong & Michal Kolesár, 2021. "Finite‐Sample Optimal Estimation and Inference on Average Treatment Effects Under Unconfoundedness," Econometrica, Econometric Society, vol. 89(3), pages 1141-1177, May.
    6. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    7. Timothy B. Armstrong & Michal Kolesár, 2021. "Sensitivity analysis using approximate moment condition models," Quantitative Economics, Econometric Society, vol. 12(1), pages 77-108, January.
    8. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    9. Victor Chernozhukov & Denis Chetverikov & Kengo Kato, 2012. "Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors," Papers 1212.6906, arXiv.org, revised Jan 2018.
    10. Karthik Muralidharan & Mauricio Romero & Kaspar Wüthrich, 2019. "Factorial Designs, Model Selection, and (Incorrect) Inference in Randomized Experiments," NBER Working Papers 26562, National Bureau of Economic Research, Inc.
    11. Guido Imbens & Stefan Wager, 2019. "Optimized Regression Discontinuity Designs," The Review of Economics and Statistics, MIT Press, vol. 101(2), pages 264-278, May.
    12. Chenchuan (Mark) Li & Ulrich K. Müller, 2020. "Linear Regression with Many Controls of Limited Explanatory Power," Working Papers 2020-57, Princeton University. Economics Department.
    13. A. Belloni & V. Chernozhukov & L. Wang, 2011. "Square-root lasso: pivotal recovery of sparse signals via conic programming," Biometrika, Biometrika Trust, vol. 98(4), pages 791-806.
    14. Michal Kolesár & Christoph Rothe, 2018. "Inference in Regression Discontinuity Designs with a Discrete Running Variable," American Economic Review, American Economic Association, vol. 108(8), pages 2277-2304, August.
    15. Koohyun Kwon & Soonwoo Kwon, 2020. "Inference in Regression Discontinuity Designs under Monotonicity," Papers 2011.14216, arXiv.org.
    16. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.

    Citations

    Cited by:

    1. Kaspar Wuthrich & Ying Zhu, 2019. "Omitted variable bias of Lasso-based inference methods: A finite sample analysis," Papers 1903.08704, arXiv.org, revised Sep 2021.
    2. Philipp Ketz & Adam McCloskey, 2021. "Short and Simple Confidence Intervals when the Directions of Some Effects are Known," Papers 2109.08222, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Uniform post selection inference for LAD regression and other z-estimation problems," CeMMAP working papers CWP74/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
    4. Philipp Bach & Victor Chernozhukov & Malte S. Kurz & Martin Spindler & Sven Klaassen, 2021. "DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R," Papers 2103.09603, arXiv.org, revised Feb 2024.
    5. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    6. Kaspar Wuthrich & Ying Zhu, 2019. "Omitted variable bias of Lasso-based inference methods: A finite sample analysis," Papers 1903.08704, arXiv.org, revised Sep 2021.
    7. Alexandre Belloni & Mingli Chen & Victor Chernozhukov, 2016. "Quantile Graphical Models: Prediction and Conditional Independence with Applications to Systemic Risk," Papers 1607.00286, arXiv.org, revised Oct 2019.
    8. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2019. "Valid Post-Selection Inference in High-Dimensional Approximately Sparse Quantile Regression Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 749-758, April.
    9. Hansen, Christian & Liao, Yuan, 2019. "The Factor-Lasso And K-Step Bootstrap Approach For Inference In High-Dimensional Economic Applications," Econometric Theory, Cambridge University Press, vol. 35(3), pages 465-509, June.
    10. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    11. Jelena Bradic & Victor Chernozhukov & Whitney K. Newey & Yinchu Zhu, 2019. "Minimax Semiparametric Learning With Approximate Sparsity," Papers 1912.12213, arXiv.org, revised Aug 2022.
    12. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2020. "lassopack: Model selection and prediction with regularized regression in Stata," Stata Journal, StataCorp LP, vol. 20(1), pages 176-235, March.
    13. Adamek, Robert & Smeekes, Stephan & Wilms, Ines, 2023. "Lasso inference for high-dimensional time series," Journal of Econometrics, Elsevier, vol. 235(2), pages 1114-1143.
    14. Philipp Ketz & Adam McCloskey, 2021. "Short and Simple Confidence Intervals when the Directions of Some Effects are Known," Papers 2109.08222, arXiv.org.
    15. Huber, Martin, 2019. "An introduction to flexible methods for policy evaluation," FSES Working Papers 504, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    16. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Dec 2017.
    17. Alexandre Belloni & Victor Chernozhukov & Ying Wei, 2016. "Post-Selection Inference for Generalized Linear Models With Many Controls," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 606-619, October.
    18. Victor Chernozhukov & Whitney K. Newey & Rahul Singh, 2022. "Automatic Debiased Machine Learning of Causal and Structural Effects," Econometrica, Econometric Society, vol. 90(3), pages 967-1027, May.
    19. Caner, Mehmet & Kock, Anders Bredahl, 2018. "Asymptotically honest confidence regions for high dimensional parameters by the desparsified conservative Lasso," Journal of Econometrics, Elsevier, vol. 203(1), pages 143-168.
    20. Jana Janková & Rajen D. Shah & Peter Bühlmann & Richard J. Samworth, 2020. "Goodness‐of‐fit testing in high dimensional generalized linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(3), pages 773-795, July.

    More about this item

    Keywords

    Regularized regression

    JEL classification:

    • C20 - Mathematical and Quantitative Methods > Single Equation Models; Single Variables > General
