
Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data

Author

Listed:
  • Serena Ng

Abstract

This paper seeks to better understand what makes big data analysis different, what we can and cannot do with existing econometric tools, and what issues need to be dealt with in order to work with the data efficiently. As a case study, I set out to extract any business cycle information that might exist in four terabytes of weekly scanner data. The main challenge is to handle the volume, variety, and characteristics of the data within the constraints of our computing environment. Scalable and efficient algorithms are available to ease the computation burden, but they often have unknown statistical properties and are not designed for the purpose of efficient estimation or optimal inference. As well, economic data have unique characteristics that generic algorithms may not accommodate. There is a need for computationally efficient econometric methods as big data is likely here to stay.

Suggested Citation

  • Serena Ng, 2017. "Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data," NBER Working Papers 23673, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:23673 Note: TWP

    Download full text from publisher

    File URL: http://www.nber.org/papers/w23673.pdf
    Download Restriction: Access to the full text is generally limited to series subscribers; however, if the top-level domain of the client browser is in a developing country or transition economy, free access is provided. More information about subscriptions and free access is available at http://www.nber.org/wwphelp.html. Free access is also available to older working papers.



    More about this item

    JEL classification:

    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
