IDEAS home Printed from https://ideas.repec.org/p/arx/papers/1807.11408.html
   My bibliography  Save this paper

Local Linear Forests

Author

Listed:
  • Rina Friedberg
  • Julie Tibshirani
  • Susan Athey
  • Stefan Wager

Abstract

Random forests are a powerful method for non-parametric regression, but are limited in their ability to fit smooth signals, and can show poor predictive performance in the presence of strong, smooth effects. Taking the perspective of random forests as an adaptive kernel method, we pair the forest kernel with a local linear regression adjustment to better capture smoothness. The resulting procedure, local linear forests, enables us to improve on asymptotic rates of convergence for random forests with smooth signals, and provides substantial gains in accuracy on both real and simulated data. We prove a central limit theorem valid under regularity conditions on the forest and smoothness constraints, and propose a computationally efficient construction for confidence intervals. Moving to a causal inference application, we discuss the merits of local regression adjustments for heterogeneous treatment effect estimation, and give an example on a dataset exploring the effect word choice has on attitudes to the social safety net. Last, we include simulation results on real and generated data.

Suggested Citation

  • Rina Friedberg & Julie Tibshirani & Susan Athey & Stefan Wager, 2018. "Local Linear Forests," Papers 1807.11408, arXiv.org, revised Sep 2020.
  • Handle: RePEc:arx:papers:1807.11408
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1807.11408
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Newey, Whitney K., 1994. "Kernel Estimation of Partial Means and a General Variance Estimator," Econometric Theory, Cambridge University Press, vol. 10(2), pages 1-21, June.
    2. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    3. Robinson, Peter M, 1988. "Root- N-Consistent Semiparametric Regression," Econometrica, Econometric Society, vol. 56(4), pages 931-954, July.
    4. Abadie, Alberto & Imbens, Guido W., 2011. "Bias-Corrected Matching Estimators for Average Treatment Effects," Journal of Business & Economic Statistics, American Statistical Association, vol. 29(1), pages 1-11.
    5. Ruoqing Zhu & Donglin Zeng & Michael R. Kosorok, 2015. "Reinforcement Learning Trees," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1770-1784, December.
    6. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, December.
    7. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Susan Athey & Julie Tibshirani & Stefan Wager, 2016. "Generalized Random Forests," Papers 1610.01271, arXiv.org, revised Apr 2018.
    2. Xinkun Nie & Stefan Wager, 2017. "Quasi-Oracle Estimation of Heterogeneous Treatment Effects," Papers 1712.04912, arXiv.org, revised Aug 2020.
    3. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    5. Guido Imbens, 2019. "Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics," NBER Working Papers 26104, National Bureau of Economic Research, Inc.
    6. Bryan S. Graham & Cristine Campos de Xavier Pinto, 2018. "Semiparametrically Efficient Estimation of the Average Linear Regression Function," NBER Working Papers 25234, National Bureau of Economic Research, Inc.
    7. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Oct 2020.
    8. Hema Yoganarasimhan & Ebrahim Barzegary & Abhishek Pani, 2020. "Design and Evaluation of Personalized Free Trials," Papers 2006.13420, arXiv.org.
    9. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    10. Dimitris Bertsimas & Agni Orfanoudaki & Rory B. Weiner, 2020. "Personalized treatment for coronary artery disease patients: a machine learning approach," Health Care Management Science, Springer, vol. 23(4), pages 482-506, December.
    11. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    12. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org.
    13. Sung Jae Jun & Sokbae Lee, 2020. "Causal Inference in Case-Control Studies," Papers 2004.08318, arXiv.org, revised Oct 2020.
    14. Khashayar Khosravi & Greg Lewis & Vasilis Syrgkanis, 2019. "Non-Parametric Inference Adaptive to Intrinsic Dimension," Papers 1901.03719, arXiv.org, revised Jun 2019.
    15. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey, 2016. "Locally robust semiparametric estimation," CeMMAP working papers CWP31/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    16. Gayle, Wayne-Roy & Namoro, Soiliou Daw, 2013. "Estimation of a nonlinear panel data model with semiparametric individual effects," Journal of Econometrics, Elsevier, vol. 175(1), pages 46-59.
    17. Deniz Ozabaci & Daniel Henderson, 2015. "Additive kernel estimates of returns to schooling," Empirical Economics, Springer, vol. 48(1), pages 227-251, February.
    18. Richard K. Crump & V. Joseph Hotz & Guido W. Imbens & Oscar A. Mitnik, 2006. "Moving the Goalposts: Addressing Limited Overlap in the Estimation of Average Treatment Effects by Changing the Estimand," NBER Technical Working Papers 0330, National Bureau of Economic Research, Inc.
    19. White, Halbert & Hong, Yongmiao, 1999. "M-Testing Using Finite and Infinite Dimensional Parameter Estimators," University of California at San Diego, Economics Working Paper Series qt9qz123ng, Department of Economics, UC San Diego.
    20. Hirukawa, Masayuki & Prokhorov, Artem, 2018. "Consistent estimation of linear regression models using matched data," Journal of Econometrics, Elsevier, vol. 203(2), pages 344-358.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:1807.11408. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: (arXiv administrators). General contact details of provider: http://arxiv.org/ .

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service hosted by the Research Division of the Federal Reserve Bank of St. Louis . RePEc uses bibliographic data supplied by the respective publishers.