Printed from https://ideas.repec.org/p/nbr/nberwo/26584.html

Machine Labor

Authors

  • Joshua Angrist
  • Brigham Frandsen

Abstract

Machine learning (ML) is mostly a predictive enterprise, while the questions of interest to labor economists are mostly causal. In pursuit of causal effects, however, ML may be useful for automated selection of ordinary least squares (OLS) control variables. We illustrate the utility of ML for regression-based causal inference by using lasso to select control variables for estimates of effects of college characteristics on wages. ML also seems relevant for an instrumental variables (IV) first stage, since the bias of two-stage least squares can be said to be due to over-fitting. Our investigation shows, however, that while ML-based instrument selection can improve on conventional 2SLS estimates, split-sample IV, jackknife IV, and LIML estimators do better. In some scenarios, the performance of ML-augmented IV estimators is degraded by pretest bias. In others, nonlinear ML for covariate control creates artificial exclusion restrictions that generate spurious findings. ML does better at choosing control variables for models identified by conditional independence assumptions than at choosing instrumental variables for models identified by exclusion restrictions.
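
The abstract's point that two-stage least squares (2SLS) bias can be read as first-stage over-fitting, and that split-sample IV does better, can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process, the sample size, the number of instruments, the first-stage coefficient, and all function names are assumptions chosen only for illustration. It compares conventional 2SLS with a split-sample estimator that fits the first stage on one half of the data and uses the resulting fitted values as an instrument in the other half.

```python
import numpy as np


def simulate(rng, n=1000, k=50, beta=1.0, pi=0.05, rho=0.8):
    """Draw one sample with k weak instruments (hypothetical design, not the paper's)."""
    Z = rng.standard_normal((n, k))              # instruments
    v = rng.standard_normal(n)                   # first-stage error
    # Structural error correlated with v, which makes x endogenous.
    u = rho * v + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    x = Z @ np.full(k, pi) + v                   # first stage
    y = beta * x + u                             # outcome equation
    return y, x, Z


def tsls(y, x, Z):
    """Conventional 2SLS: first stage fit and used on the same observations."""
    pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ pi_hat
    return (x_hat @ y) / (x_hat @ x)


def split_sample_iv(y, x, Z):
    """Fit the first stage on the first half, then use the out-of-sample
    fitted values as an instrument for x in the second half."""
    half = len(y) // 2
    pi_hat, *_ = np.linalg.lstsq(Z[:half], x[:half], rcond=None)
    x_hat = Z[half:] @ pi_hat
    return (x_hat @ y[half:]) / (x_hat @ x[half:])


def median_bias(estimator, reps=200, beta=1.0, seed=0):
    """Median of (estimate - beta) over simulated samples."""
    rng = np.random.default_rng(seed)
    draws = [estimator(*simulate(rng, beta=beta)) for _ in range(reps)]
    return float(np.median(draws) - beta)


# The same seed gives both estimators identical simulated samples.
bias_2sls = median_bias(tsls)
bias_split = median_bias(split_sample_iv)
print(f"median bias, 2SLS:            {bias_2sls:.3f}")
print(f"median bias, split-sample IV: {bias_split:.3f}")
```

In this assumed design the in-sample first stage fits noise in the endogenous regressor, which pulls 2SLS toward the OLS estimate, while the split-sample first stage is independent of the errors in the half where it is used. The median is reported rather than the mean because a just-identified IV ratio can have heavy tails.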

Suggested Citation

  • Joshua Angrist & Brigham Frandsen, 2019. "Machine Labor," NBER Working Papers 26584, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:26584
    Note: CH DEV LS PE

    Download full text from publisher

    File URL: http://www.nber.org/papers/w26584.pdf
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dennis Lim & Wenjie Wang & Yichong Zhang, 2022. "A Conditional Linear Combination Test with Many Weak Instruments," Papers 2207.11137, arXiv.org, revised Apr 2023.
    2. Marine Carrasco & Guy Tchuente, 2016. "Efficient Estimation with Many Weak Instruments Using Regularization Techniques," Econometric Reviews, Taylor & Francis Journals, vol. 35(8-10), pages 1609-1637, December.
    3. Carrasco, Marine & Tchuente, Guy, 2015. "Regularized LIML for many instruments," Journal of Econometrics, Elsevier, vol. 186(2), pages 427-442.
    4. Anna Mikusheva & Liyang Sun, 2023. "Weak Identification with Many Instruments," Papers 2308.09535, arXiv.org, revised Jan 2024.
    5. Guy Tchuente, 2019. "Weak Identification and Estimation of Social Interaction Models," Papers 1902.06143, arXiv.org.
    6. Dakyung Seong, 2022. "Binary response model with many weak instruments," Papers 2201.04811, arXiv.org, revised May 2023.
    7. Guy Tchuente, 2016. "Estimation of social interaction models using regularization," Studies in Economics 1607, School of Economics, University of Kent.
    8. Matsushita, Yukitoshi & Otsu, Taisuke, 2022. "A jackknife Lagrange multiplier test with many weak instruments," LSE Research Online Documents on Economics 116392, London School of Economics and Political Science, LSE Library.
    9. Yoonseok Lee & Yu Zhou, 2015. "Averaged Instrumental Variables Estimators," Center for Policy Research Working Papers 180, Center for Policy Research, Maxwell School, Syracuse University.
    10. Murray Michael P., 2017. "Linear Model IV Estimation When Instruments Are Many or Weak," Journal of Econometric Methods, De Gruyter, vol. 6(1), pages 1-22, January.
    11. Sølvsten, Mikkel, 2020. "Robust estimation with many instruments," Journal of Econometrics, Elsevier, vol. 214(2), pages 495-512.
    12. Kolesár, Michal, 2018. "Minimum distance approach to inference with many instruments," Journal of Econometrics, Elsevier, vol. 204(1), pages 86-100.
    13. Alena Skolkova, 2023. "Instrumental Variable Estimation with Many Instruments Using Elastic-Net IV," CERGE-EI Working Papers wp759, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    14. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments," American Economic Review, American Economic Association, vol. 105(5), pages 486-490, May.
    15. Michal Kolesár & Raj Chetty & John Friedman & Edward Glaeser & Guido W. Imbens, 2015. "Identification and Inference With Many Invalid Instruments," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 33(4), pages 474-484, October.
    16. Eric Gautier & Christiern Rose, 2022. "Fast, Robust Inference for Linear Instrumental Variables Models using Self-Normalized Moments," Papers 2211.02249, arXiv.org, revised Nov 2022.
    17. Nam-Hyun Kim & Winfried Pohlmeier, 2015. "A Regularization Approach to Biased Two-Stage Least Squares Estimation," Working Paper series 15-22, Rimini Centre for Economic Analysis.
    18. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach," Annual Review of Economics, Annual Reviews, vol. 7(1), pages 649-688, August.
    19. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2020. "lassopack: Model selection and prediction with regularized regression in Stata," Stata Journal, StataCorp LP, vol. 20(1), pages 176-235, March.
    20. Lei Bill Wang, 2023. "Estimating overidentified linear models with heteroskedasticity and outliers," Papers 2305.17615, arXiv.org, revised Apr 2024.

    More about this item

    JEL classification:

    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • C26 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Instrumental Variables (IV) Estimation
    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • J01 - Labor and Demographic Economics - - General - - - Labor Economics: General
    • J08 - Labor and Demographic Economics - - General - - - Labor Economics Policies
