Printed from https://ideas.repec.org/a/eee/econom/v211y2019i1p61-82.html

A panel quantile approach to attrition bias in Big Data: Evidence from a randomized experiment

Author

Listed:
  • Harding, Matthew
  • Lamarche, Carlos

Abstract

This paper introduces a quantile regression estimator for panel data models with individual heterogeneity and attrition. The method is motivated by the fact that attrition bias is often encountered in Big Data applications. For example, many users sign up for the latest program but few remain active several months later, which makes evaluating such interventions inherently challenging. Building on earlier work by Hausman and Wise (1979), we provide a simple identification strategy that leads to a two-step estimation procedure. In the first step, the coefficients of interest in the selection equation are consistently estimated using parametric or nonparametric methods. In the second step, standard panel quantile methods are employed on a subset of weighted observations. The estimator is computationally easy to implement in Big Data applications with a large number of subjects. We investigate the conditions under which the parameter estimator is asymptotically Gaussian, and we carry out a series of Monte Carlo simulations to investigate the finite-sample properties of the estimator. Lastly, using a simulation exercise, we apply the method to the evaluation of a recent Time-of-Day electricity pricing experiment inspired by the work of Aigner and Hausman (1980).
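
The two-step idea in the abstract can be illustrated with a deliberately stripped-down sketch. It is not the authors' estimator: it assumes attrition depends only on an observed binary characteristic, uses cell frequencies as a nonparametric first step, and replaces the panel quantile regression with an intercept-only weighted quantile (no individual effects, no covariates). All names (`retention_probabilities`, `weighted_quantile`, `simulate`) and the simulated design are hypothetical and chosen only for illustration.

```python
import random


def retention_probabilities(x, stayed):
    """Step 1 (nonparametric): estimate P(stay | x) by cell frequencies."""
    counts, stays = {}, {}
    for xi, si in zip(x, stayed):
        counts[xi] = counts.get(xi, 0) + 1
        stays[xi] = stays.get(xi, 0) + si
    return {cell: stays[cell] / counts[cell] for cell in counts}


def weighted_quantile(values, weights, tau):
    """Step 2: minimizer of the weighted check loss, intercept-only model."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    running = 0.0
    for v, w in pairs:
        running += w
        if running >= tau * total:
            return v
    return pairs[-1][0]


def simulate(n=20000, seed=0):
    """Hypothetical design: subjects with x = 1 have higher outcomes
    and are much more likely to drop out of the panel."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    stay_prob = {0: 0.9, 1: 0.3}  # assumed retention rates
    stayed = [1 if rng.random() < stay_prob[xi] else 0 for xi in x]
    return x, y, stayed


x, y, stayed = simulate()
p_hat = retention_probabilities(x, stayed)

# Only subjects who remain in the panel contribute an observed outcome.
y_obs = [yi for yi, si in zip(y, stayed) if si]
w_obs = [1.0 / p_hat[xi] for xi, si in zip(x, stayed) if si]

naive_median = weighted_quantile(y_obs, [1.0] * len(y_obs), 0.5)
ipw_median = weighted_quantile(y_obs, w_obs, 0.5)

# By symmetry of the simulated mixture, the full-population median is 1.0.
print("estimated retention probabilities:", p_hat)
print("unweighted median of stayers:", round(naive_median, 3))
print("inverse-probability-weighted median:", round(ipw_median, 3))
```

In this simulated design the unweighted median of the remaining subjects is pulled toward the low-attrition group, while reweighting each remaining subject by the inverse of its estimated retention probability moves the estimate back toward the full-population median. The same weighting logic would carry over to a quantile regression with covariates, where the weights multiply each observation's check-loss contribution.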

Suggested Citation

  • Harding, Matthew & Lamarche, Carlos, 2019. "A panel quantile approach to attrition bias in Big Data: Evidence from a randomized experiment," Journal of Econometrics, Elsevier, vol. 211(1), pages 61-82.
  • Handle: RePEc:eee:econom:v:211:y:2019:i:1:p:61-82
    DOI: 10.1016/j.jeconom.2018.12.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0304407618302355
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Farshid Vahid & Pushkar Maitra, 2006. "The effect of household characteristics on living standards in South Africa 1993-1998: a quantile regression analysis with sample attrition," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 21(7), pages 999-1018.
    2. Wang, Huixia Judy & Wang, Lan, 2009. "Locally Weighted Censored Quantile Regression," Journal of the American Statistical Association, American Statistical Association, vol. 104(487), pages 1117-1128.
    3. Hofleitner, Aude & Herring, Ryan & Bayen, Alexandre, 2012. "Arterial travel time forecast with streaming data: A hybrid approach of flow modeling and machine learning," Transportation Research Part B: Methodological, Elsevier, vol. 46(9), pages 1097-1122.
    4. Alberto Abadie & Joshua Angrist & Guido Imbens, 2002. "Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings," Econometrica, Econometric Society, vol. 70(1), pages 91-117, January.
    5. Chernozhukov, Victor & Fernández-Val, Iván & Hoderlein, Stefan & Holzmann, Hajo & Newey, Whitney, 2015. "Nonparametric identification in panels using quantiles," Journal of Econometrics, Elsevier, vol. 188(2), pages 378-392.
    6. John Fitzgerald & Peter Gottschalk & Robert Moffitt, 1998. "An Analysis of Sample Attrition in Panel Data: The Michigan Panel Study of Income Dynamics," Journal of Human Resources, University of Wisconsin Press, vol. 33(2), pages 251-299.
    7. Ramanathan, Ramu & Engle, Robert & Granger, Clive W. J. & Vahid-Araghi, Farshid & Brace, Casey, 1997. "Short-run forecasts of electricity loads and peaks," International Journal of Forecasting, Elsevier, vol. 13(2), pages 161-174, June.
    8. Arellano, Manuel & Honore, Bo, 2001. "Panel data models: some recent developments," Handbook of Econometrics, in: J.J. Heckman & E.E. Leamer (ed.), Handbook of Econometrics, edition 1, volume 5, chapter 53, pages 3229-3296, Elsevier.
    9. Ridder, Geert, 1992. "An empirical evaluation of some models for non-random attrition in panel data," Structural Change and Economic Dynamics, Elsevier, vol. 3(2), pages 337-355, December.
    10. Bhattacharya, Debopam, 2008. "Inference in panel data models under attrition caused by unobservables," Journal of Econometrics, Elsevier, vol. 144(2), pages 430-446, June.
    11. Ivan Fernandez-Val, 2005. "Bias Correction in Panel Data Models with Individual Specific Parameters," Boston University - Department of Economics - Working Papers Series WP2005-041, Boston University - Department of Economics.
    12. Ivan A. Canay, 2011. "A simple approach to quantile regression for panel data," Econometrics Journal, Royal Economic Society, vol. 14(3), pages 368-386, October.
    13. Koenker, Roger, 2004. "Quantile regression for longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 91(1), pages 74-89, October.
    14. Ekaterini Kyriazidou, 1997. "Estimation of a Panel Data Sample Selection Model," Econometrica, Econometric Society, vol. 65(6), pages 1335-1364, November.
    15. Koenker, Roger, 2005. "Quantile Regression," Cambridge Books, Cambridge University Press, number 9780521845731, January.
    16. Roy J. & Lin X., 2002. "Analysis of Multivariate Longitudinal Outcomes With Nonignorable Dropouts and Missing Covariates: Changes in Methadone Treatment Practices," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 40-52, March.
    17. Katrina Jessoe & David Rapson, 2014. "Knowledge Is (Less) Power: Experimental Evidence from Residential Energy Use," American Economic Review, American Economic Association, vol. 104(4), pages 1417-1438, April.
    18. Portnoy S., 2003. "Censored Regression Quantiles," Journal of the American Statistical Association, American Statistical Association, vol. 98, pages 1001-1012, January.
    19. Paul L. Joskow, 2012. "Creating a Smarter U.S. Electricity Grid," Journal of Economic Perspectives, American Economic Association, vol. 26(1), pages 29-48, Winter.
    20. Koichiro Ito, 2014. "Do Consumers Respond to Marginal or Average Price? Evidence from Nonlinear Electricity Pricing," American Economic Review, American Economic Association, vol. 104(2), pages 537-563, February.
    21. Keisuke Hirano & Guido W. Imbens & Geert Ridder & Donald B. Rubin, 2001. "Combining Panel Data Sets with Attrition and Refreshment Samples," Econometrica, Econometric Society, vol. 69(6), pages 1645-1659, November.
    22. Matthew Harding & Steven Sexton, 2017. "Household Response to Time-Varying Electricity Prices," Annual Review of Economics, Annual Reviews, vol. 9(1), pages 337-359, October.
    23. Harding, Matthew & Lamarche, Carlos, 2014. "Estimating and testing a quantile regression model with interactive effects," Journal of Econometrics, Elsevier, vol. 178(P1), pages 101-113.
    24. Matthew Harding & Steven Sexton, 2017. "Household Response to Time-Varying Electricity Prices," Annual Review of Resource Economics, Annual Reviews, vol. 9(1), pages 337-359, October.
    25. Das, M., 2004. "Simple estimators for nonparametric panel data models with sample attrition," Journal of Econometrics, Elsevier, vol. 120(1), pages 159-180, May.
    26. Ying Wei & Yanyuan Ma & Raymond J. Carroll, 2012. "Multiple imputation in quantile regression," Biometrika, Biometrika Trust, vol. 99(2), pages 423-438.
    27. Hausman, Jerry A & Wise, David A, 1979. "Attrition Bias in Experimental and Panel Data: The Gary Income Maintenance Experiment," Econometrica, Econometric Society, vol. 47(2), pages 455-473, March.
    28. Kato, Kengo & F. Galvao, Antonio & Montes-Rojas, Gabriel V., 2012. "Asymptotics for panel quantile regression models with individual effects," Journal of Econometrics, Elsevier, vol. 170(1), pages 76-91.
    29. Dennis J. Aigner & Jerry A. Hausman, 1980. "Correcting for Truncation Bias in the Analysis of Experiments in Time-of-Day Pricing of Electricity," Bell Journal of Economics, The RAND Corporation, vol. 11(1), pages 131-142, Spring.
    30. Nevo, Aviv, 2003. "Using Weights to Adjust for Sample Selection When Auxiliary Information Is Available," Journal of Business & Economic Statistics, American Statistical Association, vol. 21(1), pages 43-52, January.
    31. Jinyong Hahn & Whitney Newey, 2004. "Jackknife and Analytical Bias Reduction for Nonlinear Panel Models," Econometrica, Econometric Society, vol. 72(4), pages 1295-1319, July.
    32. Wooldridge, Jeffrey M., 2007. "Inverse probability weighted estimation for general missing data problems," Journal of Econometrics, Elsevier, vol. 141(2), pages 1281-1301, December.
    33. Frank A. Wolak, 2011. "Do Residential Customers Respond to Hourly Prices? Evidence from a Dynamic Pricing Experiment," American Economic Review, American Economic Association, vol. 101(3), pages 83-87, May.
    34. Yanlin Tang & Huixia Wang & Xuming He & Zhongyi Zhu, 2012. "An informative subset-based estimator for censored quantile regression," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 21(4), pages 635-655, December.
    35. Victor Chernozhukov & Iván Fernández‐Val & Jinyong Hahn & Whitney Newey, 2013. "Average and Quantile Effects in Nonseparable Panel Models," Econometrica, Econometric Society, vol. 81(2), pages 535-580, March.
    36. Matthew Harding & Carlos Lamarche, 2017. "Penalized Quantile Regression with Semiparametric Correlated Effects: An Application with Heterogeneous Preferences," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 32(2), pages 342-358, March.
    37. Rosen, Adam M., 2012. "Set identification via quantile restrictions in short panels," Journal of Econometrics, Elsevier, vol. 166(1), pages 127-137.
    38. Antonio F. Galvao & Carlos Lamarche & Luiz Renato Lima, 2013. "Estimation of Censored Quantile Regression for Panel Data With Fixed Effects," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(503), pages 1075-1089, September.
    39. Matthew Harding & Carlos Lamarche, 2016. "Empowering Consumers Through Data and Smart Technology: Experimental Evidence on the Consequences of Time‐of‐Use Electricity Pricing Policies," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 35(4), pages 906-931, September.
    40. Xuerong Chen & Alan T. K. Wan & Yong Zhou, 2015. "Efficient Quantile Regression Analysis With Missing Observations," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(510), pages 723-741, June.
    41. Lamarche, Carlos, 2010. "Robust penalized quantile regression estimation for panel data," Journal of Econometrics, Elsevier, vol. 157(2), pages 396-408, August.
    42. Hong H. & Chernozhukov V., 2002. "Three-Step Censored Quantile Regression and Extramarital Affairs," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 872-882, September.

    Citations


    Cited by:

    1. Babii, Andrii & Ball, Ryan T. & Ghysels, Eric & Striaukas, Jonas, 2023. "Machine learning panel data regressions with heavy-tailed dependent data: Theory and application," Journal of Econometrics, Elsevier, vol. 237(2).
    2. Lamarche, Carlos & Parker, Thomas, 2023. "Wild bootstrap inference for penalized quantile regression for longitudinal data," Journal of Econometrics, Elsevier, vol. 235(2), pages 1799-1826.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Galvao, Antonio F. & Kato, Kengo, 2016. "Smoothed quantile regression for panel data," Journal of Econometrics, Elsevier, vol. 193(1), pages 92-112.
    2. Li, Tong & Oka, Tatsushi, 2015. "Set identification of the censored quantile regression model for short panels with fixed effects," Journal of Econometrics, Elsevier, vol. 188(2), pages 363-377.
    3. Galvao, Antonio F. & Gu, Jiaying & Volgushev, Stanislav, 2020. "On the unbiased asymptotic normality of quantile regression with fixed effects," Journal of Econometrics, Elsevier, vol. 218(1), pages 178-215.
    4. Graham, Bryan S. & Hahn, Jinyong & Poirier, Alexandre & Powell, James L., 2018. "A quantile correlated random coefficients panel data model," Journal of Econometrics, Elsevier, vol. 206(2), pages 305-335.
    5. Matthew Harding & Carlos Lamarche, 2017. "Penalized Quantile Regression with Semiparametric Correlated Effects: An Application with Heterogeneous Preferences," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 32(2), pages 342-358, March.
    6. Manuel Arellano & Stéphane Bonhomme, 2016. "Nonlinear panel data estimation via quantile regressions," Econometrics Journal, Royal Economic Society, vol. 19(3), pages 61-94, October.
    7. Callaway, Brantly & Li, Tong & Oka, Tatsushi, 2018. "Quantile treatment effects in difference in differences models under dependence restrictions and with only two time periods," Journal of Econometrics, Elsevier, vol. 206(2), pages 395-413.
    8. Liang Chen & Yulong Huo, 2019. "A Simple Estimator for Quantile Panel Data Models Using Smoothed Quantile Regressions," Papers 1911.04729, arXiv.org.
    9. Galvao, Antonio F. & Wang, Liang, 2015. "Efficient minimum distance estimator for quantile regression fixed effects panel data," Journal of Multivariate Analysis, Elsevier, vol. 133(C), pages 1-26.
    10. Denis Chetverikov & Bradley Larsen & Christopher Palmer, 2016. "IV Quantile Regression for Group‐Level Treatments, With an Application to the Distributional Effects of Trade," Econometrica, Econometric Society, vol. 84, pages 809-833, March.
    11. Victor Chernozhukov & Ivan Fernandez-Val & Martin Weidner, 2018. "Network and panel quantile effects via distribution regression," CeMMAP working papers CWP21/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    12. Marcel Das & Vera Toepoel & Arthur van Soest, 2011. "Nonparametric Tests of Panel Conditioning and Attrition Bias in Panel Surveys," Sociological Methods & Research, vol. 40(1), pages 32-56, February.
    13. Sherrilyn Billger & Carlos Lamarche, 2015. "A panel data quantile regression analysis of the immigrant earnings distribution in the United Kingdom and United States," Empirical Economics, Springer, vol. 49(2), pages 705-750, September.
    14. Liang Chen & Juan J. Dolado & Jesús Gonzalo, 2021. "Quantile Factor Models," Econometrica, Econometric Society, vol. 89(2), pages 875-910, March.
    15. Xiao, Zhijie & Xu, Lan, 2019. "What do mean impacts miss? Distributional effects of corporate diversification," Journal of Econometrics, Elsevier, vol. 213(1), pages 92-120.
    16. Lamarche, Carlos & Parker, Thomas, 2023. "Wild bootstrap inference for penalized quantile regression for longitudinal data," Journal of Econometrics, Elsevier, vol. 235(2), pages 1799-1826.
    17. Liang Chen, 2019. "Nonparametric Quantile Regressions for Panel Data Models with Large T," Papers 1911.01824, arXiv.org, revised Sep 2020.
    18. Heng Chen & Marie-Hélène Felt & Kim P. Huynh, 2017. "Retail payment innovations and cash usage: accounting for attrition by using refreshment samples," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(2), pages 503-530, February.
    19. Battagliola, Maria Laura & Sørensen, Helle & Tolver, Anders & Staicu, Ana-Maria, 2022. "A bias-adjusted estimator in quantile regression for clustered data," Econometrics and Statistics, Elsevier, vol. 23(C), pages 165-186.
    20. Harding, Matthew & Lamarche, Carlos, 2014. "Estimating and testing a quantile regression model with interactive effects," Journal of Econometrics, Elsevier, vol. 178(P1), pages 101-113.

    More about this item

    Keywords

    Attrition; Big Data; Quantile regression; Individual effects; Time-of-Day pricing

    JEL classification:

    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • C23 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Models with Panel Data; Spatio-temporal Models
    • C25 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Discrete Regression and Qualitative Choice Models; Discrete Regressors; Proportions; Probabilities
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
