IDEAS home Printed from https://ideas.repec.org/p/nbr/nberwo/24334.html

The Impact of Big Data on Firm Performance: An Empirical Investigation

Authors

  • Patrick Bajari
  • Victor Chernozhukov
  • Ali Hortaçsu
  • Junichi Suzuki

Abstract

In academic and policy circles, there has been considerable interest in the impact of “big data” on firm performance. We examine the question of how the amount of data impacts the accuracy of Machine Learned models of weekly retail product forecasts using a proprietary data set obtained from Amazon. We examine the accuracy of forecasts in two relevant dimensions: the number of products (N), and the number of time periods for which a product is available for sale (T). Theory suggests diminishing returns to larger N and T, with relative forecast errors diminishing at rate 1/√N+1/√T. Empirical results indicate gains in forecast improvement in the T dimension; as more and more data is available for a particular product, demand forecasts for that product improve over time, though with diminishing returns to scale. In contrast, we find an essentially flat N effect across the various lines of merchandise: with a few exceptions, expansion in the number of retail products within a category does not appear associated with increases in forecast performance. We do find that the firm’s overall forecast performance, controlling for N and T effects across product lines, has improved over time, suggesting gradual improvements in forecasting from the introduction of new models and improved technology.
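As a rough illustration of the diminishing returns described in the abstract, the stylized rate 1/√N + 1/√T can be tabulated directly. This is only a sketch: the function names `error_rate` and `gain_from_doubling_t` are hypothetical, all constants are dropped, and it does not reproduce the authors' forecasting models or their proprietary data.

```python
import math


def error_rate(n_products: int, n_periods: int) -> float:
    """Stylized relative forecast error 1/sqrt(N) + 1/sqrt(T), constants omitted."""
    if n_products <= 0 or n_periods <= 0:
        raise ValueError("N and T must be positive")
    return 1.0 / math.sqrt(n_products) + 1.0 / math.sqrt(n_periods)


def gain_from_doubling_t(n_products: int, n_periods: int) -> float:
    """Reduction in the stylized error when T doubles and N is held fixed."""
    return error_rate(n_products, n_periods) - error_rate(n_products, 2 * n_periods)


if __name__ == "__main__":
    n = 100  # assumed number of products, for illustration only
    for t in (4, 16, 64, 256):
        print(f"T={t:4d}  rate={error_rate(n, t):.4f}  "
              f"gain from doubling T={gain_from_doubling_t(n, t):.4f}")
```

Under this stylized rate, each doubling of T lowers the error by less than the previous one. That matches the qualitative pattern the abstract reports for the T dimension. The flat N effect is an empirical finding and is not something this formula predicts.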

Suggested Citation

  • Patrick Bajari & Victor Chernozhukov & Ali Hortaçsu & Junichi Suzuki, 2018. "The Impact of Big Data on Firm Performance: An Empirical Investigation," NBER Working Papers 24334, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:24334
    Note: IO

    Download full text from publisher

    File URL: http://www.nber.org/papers/w24334.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Jushan Bai & Serena Ng, 2002. "Determining the Number of Factors in Approximate Factor Models," Econometrica, Econometric Society, vol. 70(1), pages 191-221, January.
    2. Ben S. Bernanke & Jean Boivin & Piotr Eliasz, 2005. "Measuring the Effects of Monetary Policy: A Factor-Augmented Vector Autoregressive (FAVAR) Approach," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 120(1), pages 387-422.
    3. Patrick Bajari & Victor Chernozhukov & Ali Hortaçsu & Junichi Suzuki, 2019. "The Impact of Big Data on Firm Performance: An Empirical Investigation," AEA Papers and Proceedings, American Economic Association, vol. 109, pages 33-37, May.
    4. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    5. Jushan Bai, 2009. "Panel Data Models With Interactive Fixed Effects," Econometrica, Econometric Society, vol. 77(4), pages 1229-1279, July.
    6. Timothy F. Bresnahan & Erik Brynjolfsson & Lorin M. Hitt, 2002. "Information Technology, Workplace Organization, and the Demand for Skilled Labor: Firm-Level Evidence," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 117(1), pages 339-376.
    7. Prasanna Tambe & Lorin M. Hitt, 2012. "The Productivity of Information Technology Investments: New Evidence from IT Labor Data," Information Systems Research, INFORMS, vol. 23(3-part-1), pages 599-617, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chen, Liang, 2012. "Identifying observed factors in approximate factor models: estimation and hypothesis testing," MPRA Paper 37514, University Library of Munich, Germany.
    2. Greenaway-McGrevy, Ryan & Han, Chirok & Sul, Donggyu, 2012. "Asymptotic distribution of factor augmented estimators for panel regression," Journal of Econometrics, Elsevier, vol. 169(1), pages 48-53.
    3. Chou, Ray Yeutien & Yen, Tso-Jung & Yen, Yu-Min, 2017. "Risk evaluations with robust approximate factor models," Journal of Banking & Finance, Elsevier, vol. 82(C), pages 244-264.
    4. Moon, Hyungsik Roger & Weidner, Martin, 2017. "Dynamic Linear Panel Regression Models With Interactive Fixed Effects," Econometric Theory, Cambridge University Press, vol. 33(1), pages 158-195, February.
    5. Castagnetti, Carolina & Rossi, Eduardo & Trapani, Lorenzo, 2019. "A two-stage estimator for heterogeneous panel models with common factors," Econometrics and Statistics, Elsevier, vol. 11(C), pages 63-82.
    6. Smets, Frank & Beyer, Robert C. M., 2015. "Labour market adjustments in Europe and the US: How different?," Working Paper Series 1767, European Central Bank.
    7. Jörg Breitung & In Choi, 2013. "Factor models," Chapters, in: Nigar Hashimzade & Michael A. Thornton (ed.), Handbook of Research Methods and Applications in Empirical Macroeconomics, chapter 11, pages 249-265, Edward Elgar Publishing.
      • In Choi & Jorg Breitung, 2011. "Factor models," Working Papers 1121, Nam Duck-Woo Economic Research Institute, Sogang University (Former Research Institute for Market Economy), revised Dec 2011.
    8. Hyungsik Roger Moon & Martin Weidner, 2013. "Dynamic linear panel regression models with interactive fixed effects," CeMMAP working papers 63/13, Institute for Fiscal Studies.
    9. Hyungsik Roger Moon & Martin Weidner, 2014. "Dynamic linear panel regression models with interactive fixed effects," CeMMAP working papers 47/14, Institute for Fiscal Studies.
    10. Fan, Jianqing & Ke, Yuan & Liao, Yuan, 2021. "Augmented factor models with applications to validating market risk factors and forecasting bond risk premia," Journal of Econometrics, Elsevier, vol. 222(1), pages 269-294.
    11. Miao, Ke & Phillips, Peter C.B. & Su, Liangjun, 2023. "High-dimensional VARs with common factors," Journal of Econometrics, Elsevier, vol. 233(1), pages 155-183.
    12. Gao, Jiti & Liu, Fei & Peng, Bin & Yan, Yayi, 2023. "Binary response models for heterogeneous panel data with interactive fixed effects," Journal of Econometrics, Elsevier, vol. 235(2), pages 1654-1679.
    13. Matthew Harding & Carlos Lamarche & Chris Muris, 2022. "Estimation of a Factor-Augmented Linear Model with Applications Using Student Achievement Data," Papers 2203.03051, arXiv.org.
    14. Chen, Liang & Dolado, Juan J. & Gonzalo, Jesús, 2014. "Detecting big structural breaks in large factor models," Journal of Econometrics, Elsevier, vol. 180(1), pages 30-48.
    15. Wang, Fa, 2017. "Maximum likelihood estimation and inference for high dimensional nonlinear factor models with application to factor-augmented regressions," MPRA Paper 93484, University Library of Munich, Germany, revised 19 May 2019.
    16. Westerlund, Joakim & Urbain, Jean-Pierre, 2013. "On the implementation and use of factor-augmented regressions in panel data," Journal of Asian Economics, Elsevier, vol. 28(C), pages 3-11.
    17. Erik Brynjolfsson & Wang Jin & Kristina McElheran, 2021. "The power of prediction: predictive analytics, workplace complements, and business performance," Business Economics, Palgrave Macmillan;National Association for Business Economics, vol. 56(4), pages 217-239, October.
    18. Bai, Jushan & Han, Xu & Shi, Yutang, 2020. "Estimation and inference of change points in high-dimensional factor models," Journal of Econometrics, Elsevier, vol. 219(1), pages 66-100.
    19. Wang, Shaoping & Cui, Guowei & Li, Kunpeng, 2015. "Factor-augmented regression models with structural change," Economics Letters, Elsevier, vol. 130(C), pages 124-127.
    20. Juan José Echavarría & Andrés González, 2012. "Choques internacionales reales y financieros y su impacto sobre la economía colombiana," Revista ESPE - Ensayos sobre Política Económica, Banco de la Republica de Colombia, vol. 30(69), pages 14-66, December.

    More about this item

    JEL classification:

    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • L81 - Industrial Organization - - Industry Studies: Services - - - Retail and Wholesale Trade; e-Commerce
