Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem
DOI: 10.1287/ijds.2022.0019
References listed on IDEAS
- Mammen, Enno & Rothe, Christoph & Schienle, Melanie, 2016.
"Semiparametric Estimation With Generated Covariates,"
Econometric Theory, Cambridge University Press, vol. 32(5), pages 1140-1177, October.
- Mammen, Enno & Rothe, Christoph & Schienle, Melanie, 2011. "Semiparametric Estimation with Generated Covariates," IZA Discussion Papers 6084, Institute of Labor Economics (IZA).
- Mammen, Enno & Rothe, Christoph & Schienle, Melanie, 2016. "Semiparametric estimation with generated covariates," Working Paper Series in Economics 81, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
- Enno Mammen & Christoph Rothe & Melanie Schienle, 2011. "Semiparametric Estimation with Generated Covariates," SFB 649 Discussion Papers SFB649DP2011-064, Sonderforschungsbereich 649, Humboldt University, Berlin, Germany.
- Enno Mammen & Christoph Rothe & Melanie Schienle, 2014. "Semiparametric Estimation with Generated Covariates," SFB 649 Discussion Papers SFB649DP2014-043, Sonderforschungsbereich 649, Humboldt University, Berlin, Germany.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018.
"Double/debiased machine learning for treatment and structural parameters,"
Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2017. "Double/Debiased Machine Learning for Treatment and Structural Parameters," NBER Working Papers 23564, National Bureau of Economic Research, Inc.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey & James Robins, 2017. "Double/debiased machine learning for treatment and structural parameters," CeMMAP working papers CWP28/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey & James Robins, 2017. "Double/debiased machine learning for treatment and structural parameters," CeMMAP working papers 28/17, Institute for Fiscal Studies.
- Kuchenhoff, Helmut & Lederer, Wolfgang & Lesaffre, Emmanuel, 2007. "Asymptotic variance estimation for the misclassification SIMEX," Computational Statistics & Data Analysis, Elsevier, vol. 51(12), pages 6197-6211, August.
- Jerry Hausman, 2001. "Mismeasured Variables in Econometric Analysis: Problems from the Right and Problems from the Left," Journal of Economic Perspectives, American Economic Association, vol. 15(4), pages 57-67, Fall.
- Bin Gu & Prabhudev Konana & Rajagopal Raghunathan & Hsuanwei Michelle Chen, 2014. "Research Note —The Allure of Homophily in Social Media: Evidence from Investor Responses on Virtual Communities," Information Systems Research, INFORMS, vol. 25(3), pages 604-617, September.
- Richard W. Blundell & James L. Powell, 2004.
"Endogeneity in Semiparametric Binary Response Models,"
The Review of Economic Studies, Review of Economic Studies Ltd, vol. 71(3), pages 655-679.
- Richard Blundell & James L. Powell, 2001. "Endogeneity in semiparametric binary response models," CeMMAP working papers CWP05/01, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Richard Blundell & James L. Powell, 2001. "Endogeneity in semiparametric binary response models," CeMMAP working papers 05/01, Institute for Fiscal Studies.
- Susanne M. Schennach, 2016. "Recent Advances in the Measurement Error Literature," Annual Review of Economics, Annual Reviews, vol. 8(1), pages 341-377, October.
- McKinley Blackburn & David Neumark, 1992.
"Unobserved Ability, Efficiency Wages, and Interindustry Wage Differentials,"
The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 107(4), pages 1421-1436.
- McKinley Blackburn & David Neumark, 1991. "Unobserved Ability, Efficiency Wages, and Interindustry Wage Differentials," NBER Working Papers 3857, National Bureau of Economic Research, Inc.
- Blaser, Rico & Fryzlewicz, Piotr, 2016. "Random rotation ensembles," LSE Research Online Documents on Economics 62182, London School of Economics and Political Science, LSE Library.
- Newey, Whitney K., 1984. "A method of moments interpretation of sequential estimators," Economics Letters, Elsevier, vol. 14(2-3), pages 201-206.
- Hausman, Jerry, 2015.
"Specification tests in econometrics,"
Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 38(2), pages 112-134.
- Hausman, Jerry A, 1978. "Specification Tests in Econometrics," Econometrica, Econometric Society, vol. 46(6), pages 1251-1271, November.
- J. A. Hausman, 1976. "Specification Tests in Econometrics," Working papers 185, Massachusetts Institute of Technology (MIT), Department of Economics.
- Angrist, Joshua D & Krueger, Alan B, 1995. "Split-Sample Instrumental Variables Estimates of the Return to Schooling," Journal of Business & Economic Statistics, American Statistical Association, vol. 13(2), pages 225-235, April.
- Yingyao Hu & Susanne M. Schennach, 2008. "Instrumental Variable Treatment of Nonclassical Measurement Error Models," Econometrica, Econometric Society, vol. 76(1), pages 195-216, January.
- Tawei Wang & Karthik N. Kannan & Jackie Rees Ulmer, 2013. "The Association Between the Disclosure and the Realization of Information Security Risk Factors," Information Systems Research, INFORMS, vol. 24(2), pages 201-218, June.
- A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012.
"Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain,"
Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
- Alexandre Belloni & Daniel Chen & Victor Chernozhukov & Christian Hansen, 2010. "Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain," Papers 1010.4345, arXiv.org, revised Apr 2015.
- Alexandre Belloni & D. Chen & Victor Chernozhukov & Christian Hansen, 2010. "Sparse models and methods for optimal instruments with an application to eminent domain," CeMMAP working papers CWP31/10, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Rohit Aggarwal & Ram Gopal & Alok Gupta & Harpreet Singh, 2012. "Putting Money Where the Mouths Are: The Relation Between Venture Financing and Electronic Word-of-Mouth," Information Systems Research, INFORMS, vol. 23(3-part-2), pages 976-992, September.
- Gérard Biau & Erwan Scornet, 2016. "A random forest guided tour," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(2), pages 197-227, June.
- David Roodman, 2009.
"A Note on the Theme of Too Many Instruments,"
Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 71(1), pages 135-158, February.
- David Roodman, 2007. "A Note on the Theme of Too Many Instruments," Working Papers 125, Center for Global Development.
- Anindya Ghose & Panagiotis G. Ipeirotis & Beibei Li, 2012. "Designing Ranking Systems for Hotels on Travel Search Engines by Mining User-Generated and Crowdsourced Content," Marketing Science, INFORMS, vol. 31(3), pages 493-520, May.
- Khim-Yong Goh & Cheng-Suang Heng & Zhijie Lin, 2013. "Social Media Brand Community and Consumer Behavior: Quantifying the Relative Impact of User- and Marketer-Generated Content," Information Systems Research, INFORMS, vol. 24(1), pages 88-107, March.
- Antonio Moreno & Christian Terwiesch, 2014. "Doing Business with Strangers: Reputation in Online Service Marketplaces," Information Systems Research, INFORMS, vol. 25(4), pages 865-886, December.
- Hausman, J. A. & Newey, W. K. & Powell, J. L., 1995.
"Nonlinear errors in variables: Estimation of some Engel curves,"
Journal of Econometrics, Elsevier, vol. 65(1), pages 205-233, January.
- J. A. Hausman & W. K. Newey & J. L. Powell, 1988. "Nonlinear Errors in Variables: Estimation of Some Engel Curves," Working papers 504, Massachusetts Institute of Technology (MIT), Department of Economics.
- Murphy, Kevin M & Topel, Robert H, 2002.
"Estimation and Inference in Two-Step Econometric Models,"
Journal of Business & Economic Statistics, American Statistical Association, vol. 20(1), pages 88-97, January.
- Murphy, Kevin M & Topel, Robert H, 1985. "Estimation and Inference in Two-Step Econometric Models," Journal of Business & Economic Statistics, American Statistical Association, vol. 3(4), pages 370-379, October.
- Fong, Christian & Tyler, Matthew, 2021. "Machine Learning Predictions as Regression Covariates," Political Analysis, Cambridge University Press, vol. 29(4), pages 467-484, October.
- Buse, A, 1992. "The Bias of Instrumental Variable Estimators," Econometrica, Econometric Society, vol. 60(1), pages 173-180, January.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey, 2017. "Double/Debiased/Neyman Machine Learning of Treatment Effects," American Economic Review, American Economic Association, vol. 107(5), pages 261-265, May.
- Yingda Lu & Kinshuk Jerath & Param Vir Singh, 2013. "The Emergence of Opinion Leaders in a Networked Online Community: A Dyadic Model with Time Dynamics and a Heuristic for Fast Estimation," Management Science, INFORMS, vol. 59(8), pages 1783-1799, August.
- Gérard Biau & Erwan Scornet, 2016. "Rejoinder on: A random forest guided tour," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(2), pages 264-268, June.
- Mochen Yang & Gediminas Adomavicius & Gordon Burtch & Yuqing Ren, 2018. "Mind the Gap: Accounting for Measurement Error and Misclassification in Variables Generated via Data Mining," Information Systems Research, INFORMS, vol. 29(1), pages 4-24, March.
- Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Dec 2017.
- Stefan Sperlich, 2009. "A note on non-parametric estimation with predicted variables," Econometrics Journal, Royal Economic Society, vol. 12(2), pages 382-395, July.
- Helmut Küchenhoff & Samuel M. Mwalili & Emmanuel Lesaffre, 2006. "A General Method for Dealing with Misclassification in Regression: The Misclassification SIMEX," Biometrics, The International Biometric Society, vol. 62(1), pages 85-96, March.
- Peter Ebbes & Michel Wedel & Ulf Böckenholt & Ton Steerneman, 2005. "Solving and Testing for Regressor-Error (in)Dependence When no Instrumental Variables are Available: With New Evidence for the Effect of Education on Income," Quantitative Marketing and Economics (QME), Springer, vol. 3(4), pages 365-392, December.
- Lingsheng Meng & Binzhen Wu & Zhaoguo Zhan, 2016. "Linear regression with an estimated regressor: applications to aggregate indicators of economic development," Empirical Economics, Springer, vol. 50(2), pages 299-316, March.
- Michael P. Murray, 2006. "Avoiding Invalid Instruments and Coping with Weak Instruments," Journal of Economic Perspectives, American Economic Association, vol. 20(4), pages 111-132, Fall.
- Bin Gu & Prabhudev Konana & Balaji Rajagopalan & Hsuan-Wei Michelle Chen, 2007. "Competition Among Virtual Communities and User Valuation: The Case of Investing-Related Communities," Information Systems Research, INFORMS, vol. 18(1), pages 68-85, March.
- Peter Ebbes & Michel Wedel & Ulf Böckenholt, 2009. "Frugal IV alternatives to identify the parameter for an endogenous regressor," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 24(3), pages 446-468, April.
- Oxley, Les & McAleer, Michael, 1993. "Econometric Issues in Macroeconomic Models with Generated Regressors," Journal of Economic Surveys, Wiley Blackwell, vol. 7(1), pages 1-40.
- Pagan, Adrian, 1984. "Econometric Issues in the Analysis of Regressions with Generated Regressors," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 25(1), pages 221-247, February.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Mochen Yang & Edward McFowland III & Gordon Burtch & Gediminas Adomavicius, 2020. "Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem," Papers 2012.10790, arXiv.org.
- Gordon Burtch & Edward McFowland III & Mochen Yang & Gediminas Adomavicius, 2023. "EnsembleIV: Creating Instrumental Variables from Ensemble Learners for Robust Statistical Inference," Papers 2303.02820, arXiv.org.
- Mochen Yang & Gediminas Adomavicius & Gordon Burtch & Yuqing Ren, 2018. "Mind the Gap: Accounting for Measurement Error and Misclassification in Variables Generated via Data Mining," Information Systems Research, INFORMS, vol. 29(1), pages 4-24, March.
- Mengke Qiao & Ke-Wei Huang, 2021. "Correcting Misclassification Bias in Regression Models with Variables Generated via Data Mining," Information Systems Research, INFORMS, vol. 32(2), pages 462-480, June.
- Mammen, Enno & Rothe, Christoph & Schienle, Melanie, 2016.
"Semiparametric Estimation With Generated Covariates,"
Econometric Theory, Cambridge University Press, vol. 32(5), pages 1140-1177, October.
- Enno Mammen & Christoph Rothe & Melanie Schienle, 2011. "Semiparametric Estimation with Generated Covariates," SFB 649 Discussion Papers SFB649DP2011-064, Sonderforschungsbereich 649, Humboldt University, Berlin, Germany.
- Mammen, Enno & Rothe, Christoph & Schienle, Melanie, 2011. "Semiparametric Estimation with Generated Covariates," IZA Discussion Papers 6084, Institute of Labor Economics (IZA).
- Mammen, Enno & Rothe, Christoph & Schienle, Melanie, 2016. "Semiparametric estimation with generated covariates," Working Paper Series in Economics 81, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
- Enno Mammen & Christoph Rothe & Melanie Schienle, 2014. "Semiparametric Estimation with Generated Covariates," SFB 649 Discussion Papers SFB649DP2014-043, Sonderforschungsbereich 649, Humboldt University, Berlin, Germany.
- Jiaming Mao & Jingzhi Xu, 2020. "Ensemble Learning with Statistical and Structural Models," Papers 2006.05308, arXiv.org.
- DUFOUR, Jean-Marie & JASIAK, Joanna, 1998.
"Finite-Sample Inference Methods for Simultaneous Equations and Models with Unobserved and Generated Regressors,"
Cahiers de recherche 9812, Universite de Montreal, Departement de sciences economiques.
- Jean-Marie Dufour & Joanna Jasiak, 2000. "Finite Sample Inference Methods for Simultaneous Equations and Models with Unobserved and Generated Regressors," Econometric Society World Congress 2000 Contributed Papers 1536, Econometric Society.
- Jean-Marie Dufour & Joanna Jasiak, 2000. "Finite Sample Inference Methods for Simultaneous Equations and Models with Unobserved and Generated Regressors," CIRANO Working Papers 2000s-13, CIRANO.
- Elliott Ash & Daniel L. Chen & Sergio Galletta, 2022.
"Measuring Judicial Sentiment: Methods and Application to US Circuit Courts,"
Economica, London School of Economics and Political Science, vol. 89(354), pages 362-376, April.
- Elliott Ash & Daniel L. Chen & Sergio Galletta, 2022. "Measuring Judicial Sentiment: Methods and Application to US Circuit Courts," Post-Print hal-03597819, HAL.
- Valente, Marica, 2023.
"Policy evaluation of waste pricing programs using heterogeneous causal effect estimation,"
Journal of Environmental Economics and Management, Elsevier, vol. 117(C).
- Marica Valente, 2020. "Policy evaluation of waste pricing programs using heterogeneous causal effect estimation," Papers 2010.01105, arXiv.org, revised Nov 2022.
- Marica Valente, 2021. "Policy Evaluation of Waste Pricing Programs Using Heterogeneous Causal Effect Estimation," Discussion Papers of DIW Berlin 1980, DIW Berlin, German Institute for Economic Research.
- Aristide Houndetoungan & Abdoul Haki Maoude, 2024. "Inference for Two-Stage Extremum Estimators," Papers 2402.05030, arXiv.org.
- Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022.
"Locally Robust Semiparametric Estimation,"
Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
- Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey, 2016. "Locally robust semiparametric estimation," CeMMAP working papers CWP31/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2016. "Locally Robust Semiparametric Estimation," Papers 1608.00033, arXiv.org, revised Aug 2020.
- Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2018. "Locally robust semiparametric estimation," CeMMAP working papers CWP30/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey, 2016. "Locally robust semiparametric estimation," CeMMAP working papers 31/16, Institute for Fiscal Studies.
- Jinyong Hahn & Jerry Hausman, 2021. "Problems with the Control Variable Approach in Achieving Unbiased Estimates in Nonlinear Models in the Presence of Many Instruments," Journal of Quantitative Economics, Springer;The Indian Econometric Society (TIES), vol. 19(1), pages 39-58, December.
- Patrick Saart & Jiti Gao & Nam Hyun Kim, 2014.
"Semiparametric methods in nonlinear time series analysis: a selective review,"
Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 26(1), pages 141-169, March.
- Patrick Saart & Jiti Gao, 2012. "Semiparametric Methods in Nonlinear Time Series Analysis: A Selective Review," Monash Econometrics and Business Statistics Working Papers 21/12, Monash University, Department of Econometrics and Business Statistics.
- Guilhem Bascle, 2008. "Controlling for endogeneity with instrumental variables in strategic management research," Post-Print hal-00576795, HAL.
- Sander Gerritsen & Mark Kattenberg & Sonny Kuijpers, 2019. "The impact of age at arrival on education and mental health," CPB Discussion Paper 389, CPB Netherlands Bureau for Economic Policy Analysis.
- Daniel Wilhelm, 2018.
"Testing for the presence of measurement error,"
CeMMAP working papers CWP45/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Daniel Wilhelm, 2019. "Testing for the presence of measurement error," CeMMAP working papers CWP48/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Daniel Wilhelm, 2019. "Testing for the Presence of Measurement Error," Economic Statistics Centre of Excellence (ESCoE) Discussion Papers ESCoE DP-2019-18, Economic Statistics Centre of Excellence (ESCoE).
- Dmitry Arkhangelsky & Guido Imbens, 2023. "Causal Models for Longitudinal and Panel Data: A Survey," Papers 2311.15458, arXiv.org, revised Mar 2024.
- Jayeeta Bhattacharya, 2020. "Quantile regression with generated dependent variable and covariates," Papers 2012.13614, arXiv.org.
- Helmut Wasserbacher & Martin Spindler, 2022. "Machine learning for financial forecasting, planning and analysis: recent developments and pitfalls," Digital Finance, Springer, vol. 4(1), pages 63-88, March.
More about this item
Keywords
machine learning; econometric analysis; instrumental variable; random forest; causal inference
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:inm:orijds:v:1:y:2022:i:2:p:138-155. See general information about how to correct material in RePEc.