Printed from https://ideas.repec.org/a/tpr/restat/v101y2019i5p743-762.html

Choosing Among Regularized Estimators in Empirical Economics: The Risk of Machine Learning

Authors

  • Alberto Abadie (MIT)
  • Maximilian Kasy (Harvard University)

Abstract

Many settings in empirical economics involve estimation of a large number of parameters. In such settings, methods that combine regularized estimation and data-driven choices of regularization parameters are useful. We provide guidance to applied researchers on the choice between regularized estimators and data-driven selection of regularization parameters. We characterize the risk and relative performance of regularized estimators as a function of the data-generating process and show that data-driven choices of regularization parameters yield estimators with risk uniformly close to the risk attained under the optimal (unfeasible) choice of regularization parameters. We illustrate using examples from empirical economics.

Suggested Citation

  • Alberto Abadie & Maximilian Kasy, 2019. "Choosing Among Regularized Estimators in Empirical Economics: The Risk of Machine Learning," The Review of Economics and Statistics, MIT Press, vol. 101(5), pages 743-762, December.
  • Handle: RePEc:tpr:restat:v:101:y:2019:i:5:p:743-762

    Download full text from publisher

    File URL: http://www.mitpressjournals.org/doi/pdf/10.1162/rest_a_00812
    Download Restriction: no

    References listed on IDEAS

    1. Alexandre Belloni & Victor Chernozhukov, 2011. "High Dimensional Sparse Econometric Models: An Introduction," Papers 1106.5242, arXiv.org, revised Sep 2011.
    2. Raj Chetty & Nathaniel Hendren, 2018. "The Impacts of Neighborhoods on Intergenerational Mobility I: Childhood Exposure Effects," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(3), pages 1107-1162.
    3. Roger Koenker & Ivan Mizera, 2014. "Convex Optimization, Shape Constraints, Compound Decisions, and Empirical Bayes Rules," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 674-685, June.
    4. Newey, Whitney K., 1997. "Convergence rates and asymptotic normality for series estimators," Journal of Econometrics, Elsevier, vol. 79(1), pages 147-168, July.
    5. James H. Stock & Mark W. Watson, 2012. "Generalized Shrinkage Methods for Forecasting Using Many Predictors," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 30(4), pages 481-493, June.
    6. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    7. Raj Chetty & John N. Friedman & Jonah E. Rockoff, 2014. "Measuring the Impacts of Teachers II: Teacher Value-Added and Student Outcomes in Adulthood," American Economic Review, American Economic Association, vol. 104(9), pages 2633-2679, September.
    8. Alan B. Krueger, 1999. "Experimental Estimates of Education Production Functions," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 114(2), pages 497-532.
    9. Joshua Angrist & Victor Chernozhukov & Iván Fernández-Val, 2006. "Quantile Regression under Misspecification, with an Application to the U.S. Wage Structure," Econometrica, Econometric Society, vol. 74(2), pages 539-563, March.
    10. Leeb, Hannes & Pötscher, Benedikt M., 2006. "Performance Limits For Estimators Of The Risk Or Distribution Of Shrinkage-Type Estimators, And Some General Lower Risk-Bound Results," Econometric Theory, Cambridge University Press, vol. 22(1), pages 69-97, February.
    11. Raj Chetty & Nathaniel Hendren, 2018. "The Impacts of Neighborhoods on Intergenerational Mobility II: County-Level Estimates," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(3), pages 1163-1228.
    12. David S. Abrams & Marianne Bertrand & Sendhil Mullainathan, 2012. "Do Judges Vary in Their Treatment of Race?," The Journal of Legal Studies, University of Chicago Press, vol. 41(2), pages 347-383.
    13. Koenker, Roger & Mizera, Ivan, 2014. "Convex Optimization in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 60(i05).

    Citations



    Cited by:

    1. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg‐Møller, 2022. "Robust Empirical Bayes Confidence Intervals," Econometrica, Econometric Society, vol. 90(6), pages 2567-2602, November.
    2. Julien Chevallier & Dominique Guégan & Stéphane Goutte, 2021. "Is It Possible to Forecast the Price of Bitcoin?," Forecasting, MDPI, vol. 3(2), pages 1-44, May.
    3. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stéphane Surprenant, 2022. "How is machine learning useful for macroeconomic forecasting?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 920-964, August.
    4. Domenico Giannone & Michele Lenza & Giorgio E. Primiceri, 2021. "Economic Predictions With Big Data: The Illusion of Sparsity," Econometrica, Econometric Society, vol. 89(5), pages 2409-2437, September.
    5. Mike Gilraine & Jiaying Gu & Robert McMillan, 2022. "A Nonparametric Approach for Studying Teacher Impacts," Working Papers tecipa-716, University of Toronto, Department of Economics.
    6. Giuseppe De Luca & Jan R. Magnus & Franco Peracchi, 2022. "Asymptotic properties of the weighted-average least squares (WALS) estimator," EIEF Working Papers Series 2203, Einaudi Institute for Economics and Finance (EIEF), revised Mar 2022.
    7. Michael Gilraine & Jiaying Gu & Robert McMillan, 2020. "A New Method for Estimating Teacher Value-Added," NBER Working Papers 27094, National Bureau of Economic Research, Inc.
    8. Michael Gilraine & Jiaying Gu & Robert McMillan, 2021. "A Nonparametric Method for Estimating Teacher Value-Added," Working Papers tecipa-689, University of Toronto, Department of Economics.
    9. Akbas, Ozan E. & Betz, Frank & Gattini, Luca, 2023. "Quantifying credit gaps using survey data on discouraged borrowers," EIB Working Papers 2023/06, European Investment Bank (EIB).
    10. Isaiah Andrews & Jesse M. Shapiro, 2021. "A Model of Scientific Communication," Econometrica, Econometric Society, vol. 89(5), pages 2117-2142, September.
    11. Altındağ, Onur & O'Connell, Stephen D. & Şaşmaz, Aytuğ & Balcıoğlu, Zeynep & Cadoni, Paola & Jerneck, Matilda & Foong, Aimee Kunze, 2021. "Targeting humanitarian aid using administrative data: Model design and validation," Journal of Development Economics, Elsevier, vol. 148(C).
    12. Tatiana de Macedo Nogueira Lima, 2022. "Documento de Trabalho 03/2022 - Aprendizado de máquina e antitruste," Documentos de Trabalho 2022030, Conselho Administrativo de Defesa Econômica (Cade), Departamento de Estudos Econômicos.
    13. Luca Coraggio & Marco Pagano & Annalisa Scognamiglio & Joacim Tåg, 2022. "JAQ of All Trades: Job Mismatch, Firm Productivity and Managerial Quality," EIEF Working Papers Series 2205, Einaudi Institute for Economics and Finance (EIEF), revised Mar 2022.
    14. Georges, Christophre & Pereira, Javier, 2021. "Market stability with machine learning agents," Journal of Economic Dynamics and Control, Elsevier, vol. 122(C).
    15. Jan Niederreiter, 2023. "Broadening Economics in the Era of Artificial Intelligence and Experimental Evidence," Italian Economic Journal: A Continuation of Rivista Italiana degli Economisti and Giornale degli Economisti, Springer;Società Italiana degli Economisti (Italian Economic Association), vol. 9(1), pages 265-294, March.
    16. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg-Møller, 2020. "Robust Empirical Bayes Confidence Intervals," Papers 2004.03448, arXiv.org, revised May 2022.
    17. Brian Asquith & Judith K. Hellerstein & Mark J. Kutzbach & David Neumark, 2021. "Social capital determinants and labor market networks," Journal of Regional Science, Wiley Blackwell, vol. 61(1), pages 212-260, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Timothy B. Armstrong & Michal Kolesár & Mikkel Plagborg‐Møller, 2022. "Robust Empirical Bayes Confidence Intervals," Econometrica, Econometric Society, vol. 90(6), pages 2567-2602, November.
    2. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Michael Geruso & Timothy J. Layton & Jacob Wallace, 2023. "What Difference Does a Health Plan Make? Evidence from Random Plan Assignment in Medicaid," American Economic Journal: Applied Economics, American Economic Association, vol. 15(3), pages 341-379, July.
    4. Johannes S. Kunz & Kevin E. Staub & Rainer Winkelmann, 2021. "Predicting individual effects in fixed effects panel probit models," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(3), pages 1109-1145, July.
    5. Kunz, J.S. & Staub, K.E. & Winkelmann, R., 2018. "Predicting fixed effects in panel probit models," Health, Econometrics and Data Group (HEDG) Working Papers 18/23, HEDG, c/o Department of Economics, University of York.
    6. Raffaella Giacomini & Sokbae Lee & Silvia Sarpietro, 2023. "A Robust Method for Microforecasting and Estimation of Random Effects," Working Paper Series WP 2023-26, Federal Reserve Bank of Chicago.
    7. Jo Blanden & Matthias Doepke & Jan Stuhler, 2022. "Education inequality," CEP Discussion Papers dp1849, Centre for Economic Performance, LSE.
    8. Simon Fan & Yu Pang & Pierre Pestieau, 2023. "Nature versus Nurture in Social Mobility Under Private and Public Education Systems," Public Finance Review, vol. 51(1), pages 132-167, January.
    9. Valentin Verdier, 2020. "Estimation and Inference for Linear Models with Two-Way Fixed Effects and Sparsely Matched Data," The Review of Economics and Statistics, MIT Press, vol. 102(1), pages 1-16, March.
    10. Hanushek, Eric A. & Jacobs, Babs & Schwerdt, Guido & Van der Velden, Rolf & Vermeulen, Stan & Wiederhold, Simon, 2021. "The Intergenerational Transmission of Cognitive Skills: An Investigation of the Causal Impact of Families on Student Outcomes," IZA Discussion Papers 14854, Institute of Labor Economics (IZA).
    11. Eric A. Hanushek & Babs Jacobs & Guido Schwerdt & Rolf van der Velden & Stan Vermeulen & Simon Wiederhold, 2021. "Where Do STEM Graduates Stem From? The Intergenerational Transmission of Comparative Skill Advantages," CESifo Working Paper Series 9388, CESifo.
    12. Patrick Kline & Christopher Walters, 2021. "Reasonable Doubt: Experimental Detection of Job‐Level Employment Discrimination," Econometrica, Econometric Society, vol. 89(2), pages 765-792, March.
    13. Lleras-Muney, Adriana & Price, Joseph & Yue, Dahai, 2022. "The association between educational attainment and longevity using individual-level data from the 1940 census," Journal of Health Economics, Elsevier, vol. 84(C).
    14. Jiaying Gu & Roger Koenker, 2020. "Invidious Comparisons: Ranking and Selection as Compound Decisions," Papers 2012.12550, arXiv.org, revised Sep 2021.
    15. M. Keith Chen & Kareem Haggag & Devin G. Pope & Ryne Rohla, 2019. "Racial Disparities in Voting Wait Times: Evidence from Smartphone Data," NBER Working Papers 26487, National Bureau of Economic Research, Inc.
    16. Stéphane Bonhomme & Martin Weidner, 2019. "Posterior average effects," CeMMAP working papers CWP43/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    17. Godøy, Anna & Huitfeldt, Ingrid, 2020. "Regional variation in health care utilization and mortality," Journal of Health Economics, Elsevier, vol. 71(C).
    18. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2011. "Inference for high-dimensional sparse econometric models," CeMMAP working papers CWP41/11, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    19. Fox, Jeremy T. & Kim, Kyoo il & Yang, Chenyu, 2016. "A simple nonparametric approach to estimating the distribution of random coefficients in structural models," Journal of Econometrics, Elsevier, vol. 195(2), pages 236-254.
    20. Committee, Nobel Prize, 2021. "Answering causal questions using observational data," Nobel Prize in Economics documents 2021-2, Nobel Prize Committee.

