Least Squares Estimation Using Sketched Data with Heteroskedastic Errors

Author

Listed:
  • Sokbae Lee
  • Serena Ng

Abstract

Researchers may perform regressions using a sketch of data of size $m$ instead of the full sample of size $n$ for a variety of reasons. This paper considers the case when the regression errors do not have constant variance and heteroskedasticity-robust standard errors would normally be needed for test statistics to provide accurate inference. We show that estimates using data sketched by random projections will behave "as if" the errors were homoskedastic. Estimation by random sampling would not have this property. The result arises because the sketched estimates in the case of random projections can be expressed as degenerate $U$-statistics, and under certain conditions, these statistics are asymptotically normal with homoskedastic variance. We verify that the conditions hold not only in the case of least squares regression when the covariates are exogenous, but also in instrumental variables estimation when the covariates are endogenous. The result implies that inference, including first-stage F tests for instrument relevance, can be simpler than in the full-sample case if the sketching scheme is appropriately chosen.
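
The contrast drawn in the abstract between the two sketching schemes can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a Gaussian random projection as one example of a random-projection sketch, a simple design in which the error variance grows with the regressor, and HC0 as the robust variance estimator. The function name `ols_with_se`, the sample sizes, and the data-generating process are all hypothetical choices made for illustration.

```python
import numpy as np


def ols_with_se(X, y):
    """OLS coefficients with classical and HC0 robust standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    # Classical (homoskedastic) variance estimate.
    sigma2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))
    # HC0 sandwich: sum_i e_i^2 x_i x_i' in the middle.
    meat = (X * resid[:, None] ** 2).T @ X
    se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_classical, se_robust


rng = np.random.default_rng(0)
n, m = 4000, 400  # full sample size and sketch size (assumed values)

# Illustrative design: the error standard deviation is proportional to |x|.
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(n) * np.abs(x)

# Random projection: m x n Gaussian matrix with N(0, 1/m) entries.
Pi = rng.standard_normal((m, n)) / np.sqrt(m)
X_rp, y_rp = Pi @ X, Pi @ y
beta_rp, se_rp_c, se_rp_r = ols_with_se(X_rp, y_rp)

# Random sampling: m rows drawn uniformly without replacement.
idx = rng.choice(n, size=m, replace=False)
beta_rs, se_rs_c, se_rs_r = ols_with_se(X[idx], y[idx])

# Ratio of robust to classical standard error for the slope coefficient.
ratio_rp = se_rp_r[1] / se_rp_c[1]
ratio_rs = se_rs_r[1] / se_rs_c[1]
print(f"random projection: robust/classical SE ratio = {ratio_rp:.3f}")
print(f"random sampling:   robust/classical SE ratio = {ratio_rs:.3f}")
```

If the abstract's claim applies to this design, the ratio for the random-projection sketch should be close to one, while the ratio for random sampling would be expected to differ noticeably from one. A single draw is noisy, so repeating the experiment over many seeds gives a clearer picture.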

Suggested Citation

  • Sokbae Lee & Serena Ng, 2020. "Least Squares Estimation Using Sketched Data with Heteroskedastic Errors," Papers 2007.07781, arXiv.org, revised Jun 2022.
  • Handle: RePEc:arx:papers:2007.07781

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2007.07781
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Sokbae Lee & Serena Ng, 2020. "An Econometric Perspective on Algorithmic Subsampling," Annual Review of Economics, Annual Reviews, vol. 12(1), pages 45-80, August.
    2. Joshua D. Angrist & Alan B. Krueger, 1993. "Split Sample Instrumental Variables," Working Papers 699, Princeton University, Department of Economics, Industrial Relations Section.
    3. Atsushi Inoue & Gary Solon, 2010. "Two-Sample Instrumental Variables Estimators," The Review of Economics and Statistics, MIT Press, vol. 92(3), pages 557-561, August.
    4. Angrist, Joshua D & Krueger, Alan B, 1995. "Split-Sample Instrumental Variables Estimates of the Return to Schooling," Journal of Business & Economic Statistics, American Statistical Association, vol. 13(2), pages 225-235, April.
    5. Joshua D. Angrist & Alan B. Krueger, 1991. "Does Compulsory School Attendance Affect Schooling and Earnings?," The Quarterly Journal of Economics, Oxford University Press, vol. 106(4), pages 979-1014.
    6. Hall, Peter, 1984. "Central limit theorem for integrated square error of multivariate nonparametric density estimators," Journal of Multivariate Analysis, Elsevier, vol. 14(1), pages 1-16, February.

    Citations


    Cited by:

    1. Gupta, Joyeeta & Bavinck, Maarten & Ros-Tonen, Mirjam & Asubonteng, Kwabena & Bosch, Hilmer & van Ewijk, Edith & Hordijk, Michaela & Van Leynseele, Yves & Lopes Cardozo, Mieke & Miedema, Esther & Pouw, 2021. "COVID-19, poverty and inclusive development," World Development, Elsevier, vol. 145(C).
    2. Harold D Chiang & Yuya Sasaki, 2023. "On Using The Two-Way Cluster-Robust Standard Errors," Papers 2301.13775, arXiv.org.
    3. Sokbae Lee & Yuan Liao & Myung Hwan Seo & Youngki Shin, 2022. "Fast Inference for Quantile Regression with Tens of Millions of Observations," Papers 2209.14502, arXiv.org, revised Oct 2023.
    4. Hensher, David A., 2021. "The case for negotiated contracts under the transition to a green bus fleet," Transportation Research Part A: Policy and Practice, Elsevier, vol. 154(C), pages 255-269.
    5. Günther, Jutta (Ed.) & Wedemeier, Jan (Ed.), 2020. "Struktureller Umbruch durch COVID-19: Implikationen für die Innovationspolitik im Land Bremen," HWWI Policy Papers 128, Hamburg Institute of International Economics (HWWI).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cantarella, Michele & Strozzi, Chiara, 2019. "Workers in the Crowd: The Labour Market Impact of the Online Platform Economy," IZA Discussion Papers 12327, Institute of Labor Economics (IZA).
    2. Michele Cantarella & Chiara Strozzi, 2018. "Labour market effects of crowdwork in US and EU: an empirical investigation," Department of Economics 0139, University of Modena and Reggio E., Faculty of Economics "Marco Biagi".
    3. Michele Cantarella & Chiara Strozzi, 2018. "Labour market effects of crowdwork in the US and EU: an empirical investigation," Center for Economic Research (RECent) 140, University of Modena and Reggio E., Dept. of Economics "Marco Biagi".
    4. Michele Cantarella & Chiara Strozzi, 2021. "Workers in the crowd: the labor market impact of the online platform economy [An evaluation of instrumental variable strategies for estimating the effects of catholic schooling]," Industrial and Corporate Change, Oxford University Press and the Associazione ICC, vol. 30(6), pages 1429-1458.
    5. Isaiah Andrews & Timothy B. Armstrong, 2017. "Unbiased instrumental variables estimation under known first‐stage sign," Quantitative Economics, Econometric Society, vol. 8(2), pages 479-503, July.
    6. Bernheim, B. Douglas & Garrett, Daniel M. & Maki, Dean M., 2001. "Education and saving: The long-term effects of high school financial curriculum mandates," Journal of Public Economics, Elsevier, vol. 80(3), pages 435-465, June.
    7. Rietveld, Cornelius A. & Webbink, Dinand, 2016. "On the genetic bias of the quarter of birth instrument," Economics & Human Biology, Elsevier, vol. 21(C), pages 137-146.
    8. Shakil, Golam Saroare & Marsh, Thomas L., 2021. "One Instrument to Rule Them All?," 2021 Annual Meeting, August 1-3, Austin, Texas 314047, Agricultural and Applied Economics Association.
    9. Henry S Farber & Daniel Herbst & Ilyana Kuziemko & Suresh Naidu, 2021. "Unions and Inequality over the Twentieth Century: New Evidence from Survey Data," The Quarterly Journal of Economics, Oxford University Press, vol. 136(3), pages 1325-1385.
    10. Lefranc, Arnaud, 2018. "Intergenerational Earnings Persistence and Economic Inequality in the Long-Run: Evidence from French Cohorts, 1931-1975," IZA Discussion Papers 11406, Institute of Labor Economics (IZA).
    11. Lei, Wang & Li, Mengjie & Zhang, Siqi & Sun, Yonglei & Sylvia, Sean & Yang, Enyan & Ma, Guangrong & Zhang, Linxiu & Mo, Di & Rozelle, Scott, 2018. "Contract teachers and student achievement in rural China: evidence from class fixed effects," Australian Journal of Agricultural and Resource Economics, Australian Agricultural and Resource Economics Society, vol. 62(2), April.
    12. Julie L. Hotchkiss & Anil Rupasingha & Thor Watson, 2022. "In-migration and Dilution of Community Social Capital," International Regional Science Review, vol. 45(1), pages 36-57, January.
    13. Prokhorov, Artem & Schmidt, Peter, 2009. "GMM redundancy results for general missing data problems," Journal of Econometrics, Elsevier, vol. 151(1), pages 47-55, July.
    14. Joshua D. Angrist & Alan B. Krueger, 2001. "Instrumental Variables and the Search for Identification: From Supply and Demand to Natural Experiments," Journal of Economic Perspectives, American Economic Association, vol. 15(4), pages 69-85, Fall.
    15. Paul A. Bekker & Jan van der Ploeg, 2000. "Instrumental Variable Estimation Based on Grouped Data," Econometric Society World Congress 2000 Contributed Papers 1862, Econometric Society.
    16. Pacini, David & Windmeijer, Frank, 2016. "Robust inference for the Two-Sample 2SLS estimator," Economics Letters, Elsevier, vol. 146(C), pages 50-54.
    17. Angrist, J D & Imbens, G W & Krueger, A B, 1999. "Jackknife Instrumental Variables Estimation," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 14(1), pages 57-67, Jan.-Feb.
    18. Faqin Lin & Can Huang & Xiaobo He & Chao Zhang, 2013. "Do more highly educated entrepreneurs matter?," Asian-Pacific Economic Literature, Asia Pacific School of Economics and Government, The Australian National University, vol. 27(2), pages 104-116, November.
    19. Thomas Crossley & Peter Levell & Hamish Low, 2020. "House Price Rises and Borrowing to Invest," IFS Working Papers W20/2, Institute for Fiscal Studies.
    20. Jean-Marie Dufour & Mohamed Taamouti, 2005. "Projection-Based Statistical Inference in Linear Structural Models with Possibly Weak Instruments," Econometrica, Econometric Society, vol. 73(4), pages 1351-1365, July.
