Printed from https://ideas.repec.org/a/cje/issued/v51y2018i3p695-746.html

Big data analytics in economics: What have we learned so far, and where should we go from here?

Author

Listed:
  • Norman R. Swanson
  • Weiqi Xiong

Abstract

Research into predictive accuracy testing remains at the forefront of the forecasting field. One reason for this is that rankings of predictive accuracy across alternative models, which under misspecification are loss function dependent, are universally utilized to assess the usefulness of econometric models. A second reason, which corresponds to the objective of this paper, is that researchers are currently focusing considerable attention on so-called big data and on new (and old) tools that are available for the analysis of this data. One of the objectives in this field is the assessment of whether big data leads to improvement in forecast accuracy. In this survey paper, we discuss some of the latest (and most interesting) methods currently available for analyzing and utilizing big data when the objective is improved prediction. Our discussion includes a summary of various so-called dimension reduction, shrinkage and machine learning methods as well as a summary of recent tools that are useful for ranking prediction models associated with the implementation of these methods. We also provide a brief empirical illustration of big data in action, in which we show that big data are indeed useful when predicting the term structure of interest rates.

Suggested Citation

  • Norman R. Swanson & Weiqi Xiong, 2018. "Big data analytics in economics: What have we learned so far, and where should we go from here?," Canadian Journal of Economics, Canadian Economics Association, vol. 51(3), pages 695-746, August.
  • Handle: RePEc:cje:issued:v:51:y:2018:i:3:p:695-746
    DOI: 10.1111/caje.12336

    Download full text from publisher

    File URL: https://doi.org/10.1111/caje.12336
    Download Restriction: access restricted to subscribers



    References listed on IDEAS

    1. Banerjee, Anindya & Marcellino, Massimiliano & Masten, Igor, 2014. "Forecasting with factor-augmented error correction models," International Journal of Forecasting, Elsevier, vol. 30(3), pages 589-612.
    2. Forni, Mario & Hallin, Marc & Lippi, Marco & Reichlin, Lucrezia, 2005. "The Generalized Dynamic Factor Model: One-Sided Estimation and Forecasting," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 830-840, September.
    3. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    4. Marine Carrasco & Barbara Rossi, 2016. "In-Sample Inference and Forecasting in Misspecified Factor Models," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(3), pages 313-338, July.
    5. Forni, Mario & Lippi, Marco, 2001. "The Generalized Dynamic Factor Model: Representation Theory," Econometric Theory, Cambridge University Press, vol. 17(6), pages 1113-1141, December.
    6. Raffaella Giacomini & Halbert White, 2006. "Tests of Conditional Predictive Ability," Econometrica, Econometric Society, vol. 74(6), pages 1545-1578, November.
    7. Christensen, Jens H.E. & Diebold, Francis X. & Rudebusch, Glenn D., 2011. "The affine arbitrage-free class of Nelson-Siegel term structure models," Journal of Econometrics, Elsevier, vol. 164(1), pages 4-20, September.
    8. Gonçalves, Sílvia & Perron, Benoit, 2014. "Bootstrapping factor-augmented regression models," Journal of Econometrics, Elsevier, vol. 182(1), pages 156-173.
    9. Clark, Todd E. & McCracken, Michael W., 2001. "Tests of equal forecast accuracy and encompassing for nested models," Journal of Econometrics, Elsevier, vol. 105(1), pages 85-110, November.
    10. Raffaella Giacomini & Barbara Rossi, 2009. "Detecting and Predicting Forecast Breakdowns," Review of Economic Studies, Oxford University Press, vol. 76(2), pages 669-705.
    11. Jushan Bai & Serena Ng, 2002. "Determining the Number of Factors in Approximate Factor Models," Econometrica, Econometric Society, vol. 70(1), pages 191-221, January.
    12. Christian Schumacher, 2007. "Forecasting German GDP using alternative factor models based on large datasets," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 26(4), pages 271-302.
    13. Laura Coroneo & Domenico Giannone & Michele Modugno, 2016. "Unspanned Macroeconomic Factors in the Yield Curve," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(3), pages 472-485, July.
    14. West, Kenneth D, 1996. "Asymptotic Inference about Predictive Ability," Econometrica, Econometric Society, vol. 64(5), pages 1067-1084, September.
    15. Diebold, Francis X. & Li, Canlin, 2006. "Forecasting the term structure of government bond yields," Journal of Econometrics, Elsevier, vol. 130(2), pages 337-364, February.
    16. Breitung, Jörg & Eickmeier, Sandra, 2011. "Testing for structural breaks in dynamic factor models," Journal of Econometrics, Elsevier, vol. 163(1), pages 71-84, July.
    17. Andrews, Donald W. K. & Stock, James H. (ed.), 2005. "Identification and Inference for Econometric Models," Cambridge Books, Cambridge University Press, number 9780521844413.
    18. Diebold, Francis X & Mariano, Roberto S, 2002. "Comparing Predictive Accuracy," Journal of Business & Economic Statistics, American Statistical Association, vol. 20(1), pages 134-144, January.
    19. Clive W.J. Granger, 1999. "Outline of forecast theory using generalized cost functions," Spanish Economic Review, Springer;Spanish Economic Association, vol. 1(2), pages 161-173.
    20. Schumacher, Christian, 2010. "Factor forecasting using international targeted predictors: The case of German GDP," Economics Letters, Elsevier, vol. 107(2), pages 95-98, May.
    21. Forni, Mario & Hallin, Marc & Lippi, Marco & Zaffaroni, Paolo, 2015. "Dynamic factor models with infinite-dimensional factor spaces: One-sided representations," Journal of Econometrics, Elsevier, vol. 185(2), pages 359-371.
    22. Chen, Liang & Dolado, Juan J. & Gonzalo, Jesús, 2014. "Detecting big structural breaks in large factor models," Journal of Econometrics, Elsevier, vol. 180(1), pages 30-48.
    23. Michael P. Clements & David F. Hendry, 2001. "Forecasting with difference-stationary and trend-stationary models," Econometrics Journal, Royal Economic Society, vol. 4(1), pages 1-19.
    24. Anindya Banerjee & Massimiliano Marcellino & Igor Masten, 2008. "Forecasting Macroeconomic Variables Using Diffusion Indexes in Short Samples with Structural Change," Working Papers 334, IGIER (Innocenzo Gasparini Institute for Economic Research), Bocconi University.
    25. Bai, Jushan & Ng, Serena, 2008. "Forecasting economic time series using targeted predictors," Journal of Econometrics, Elsevier, vol. 146(2), pages 304-317, October.
    26. Michael P. Clements & David F. Hendry, 1999. "On winning forecasting competitions in economics," Spanish Economic Review, Springer;Spanish Economic Association, vol. 1(2), pages 123-160.
    27. West, Kenneth D & McCracken, Michael W, 1998. "Regression-Based Tests of Predictive Ability," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 39(4), pages 817-840, November.
    28. Onatski, Alexei, 2015. "Asymptotic analysis of the squared estimation error in misspecified factor models," Journal of Econometrics, Elsevier, vol. 186(2), pages 388-406.
    29. Christoffersen, Peter F, 1998. "Evaluating Interval Forecasts," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 39(4), pages 841-862, November.
    30. Clements, Michael P. & Hendry, David F. (ed.), 2011. "The Oxford Handbook of Economic Forecasting," OUP Catalogue, Oxford University Press, number 9780195398649, December.
    31. Diebold, Francis X. & Shin, Minchul, 2015. "Assessing point forecast accuracy by stochastic loss distance," Economics Letters, Elsevier, vol. 130(C), pages 37-38.
    32. Barbara Rossi & Atsushi Inoue, 2012. "Out-of-Sample Forecast Tests Robust to the Choice of Window Size," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 30(3), pages 432-453, April.
    33. Cheng, Xu & Hansen, Bruce E., 2015. "Forecasting with factor-augmented regression: A frequentist model averaging approach," Journal of Econometrics, Elsevier, vol. 186(2), pages 280-293.
    34. Corradi, Valentina & Swanson, Norman R., 2014. "Testing for structural stability of factor augmented forecasting models," Journal of Econometrics, Elsevier, vol. 182(1), pages 100-118.
    35. Elliott, Graham & Timmermann, Allan, 2004. "Optimal forecast combinations under general loss functions and forecast error distributions," Journal of Econometrics, Elsevier, vol. 122(1), pages 47-79, September.
    36. Keisuke Hirano & Jonathan H. Wright, 2017. "Forecasting With Model Uncertainty: Representations and Risk Reduction," Econometrica, Econometric Society, vol. 85, pages 617-643, March.
    37. Jean Boivin & Serena Ng, 2005. "Understanding and Comparing Factor-Based Forecasts," International Journal of Central Banking, International Journal of Central Banking, vol. 1(3), December.
    38. Xiaohong Chen & Norman R. Swanson (ed.), 2013. "Recent Advances and Future Directions in Causality, Prediction, and Specification Analysis," Springer Books, Springer, edition 127, number 978-1-4614-1653-1, June.
    39. James H. Stock & Mark W. Watson, 2012. "Generalized Shrinkage Methods for Forecasting Using Many Predictors," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 30(4), pages 481-493, June.
    40. Chen, Yin-Ping & Huang, Hsin-Cheng & Tu, I-Ping, 2010. "A new approach for selecting the number of factors," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 2990-2998, December.
    41. Jushan Bai, 2003. "Inferential Theory for Factor Models of Large Dimensions," Econometrica, Econometric Society, vol. 71(1), pages 135-171, January.
    42. Goncalves, Silvia & White, Halbert, 2004. "Maximum likelihood and the bootstrap for nonlinear dynamic models," Journal of Econometrics, Elsevier, vol. 119(1), pages 199-219, March.
    43. Mario Forni & Marc Hallin & Marco Lippi & Lucrezia Reichlin, 2000. "The Generalized Dynamic-Factor Model: Identification And Estimation," The Review of Economics and Statistics, MIT Press, vol. 82(4), pages 540-554, November.
    44. Chao, John & Corradi, Valentina & Swanson, Norman R., 2001. "Out-Of-Sample Tests For Granger Causality," Macroeconomic Dynamics, Cambridge University Press, vol. 5(4), pages 598-620, September.
    45. Corradi, Valentina & Swanson, Norman R., 2002. "A consistent test for nonlinear out of sample predictive accuracy," Journal of Econometrics, Elsevier, vol. 110(2), pages 353-381, October.
    46. Ang, Andrew & Piazzesi, Monika, 2003. "A no-arbitrage vector autoregression of term structure dynamics with macroeconomic and latent variables," Journal of Monetary Economics, Elsevier, vol. 50(4), pages 745-787, May.
    47. Bai, Jushan & Ng, Serena, 2007. "Determining the Number of Primitive Shocks in Factor Models," Journal of Business & Economic Statistics, American Statistical Association, vol. 25, pages 52-60, January.
    48. Corradi, Valentina & Swanson, Norman R., 2005. "A Test For Comparing Multiple Misspecified Conditional Interval Models," Econometric Theory, Cambridge University Press, vol. 21(5), pages 991-1016, October.
    49. Kim, Hyun Hak & Swanson, Norman R., 2014. "Forecasting financial and macroeconomic variables using data reduction methods: New empirical evidence," Journal of Econometrics, Elsevier, vol. 178(P2), pages 352-367.
    50. Valentina Corradi & Norman R. Swanson, 2007. "Nonparametric Bootstrap Procedures For Predictive Inference Based On Recursive Estimation Schemes," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 48(1), pages 67-109, February.
    51. Jean-Marie Dufour & Dalibor Stevanović, 2013. "Factor-Augmented VARMA Models With Macroeconomic Applications," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 31(4), pages 491-506, October.
    52. Jin, Sainan & Corradi, Valentina & Swanson, Norman R., 2017. "Robust Forecast Comparison," Econometric Theory, Cambridge University Press, vol. 33(6), pages 1306-1351, December.
    53. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    54. Jushan Bai & Serena Ng, 2009. "Boosting diffusion indices," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 24(4), pages 607-629.
    55. G. Elliott & C. Granger & A. Timmermann (ed.), 2013. "Handbook of Economic Forecasting," Handbook of Economic Forecasting, Elsevier, edition 1, volume 2, number 2.
    56. Corradi, Valentina & Swanson, Norman R. & Olivetti, Claudia, 2001. "Predictive ability with cointegrated variables," Journal of Econometrics, Elsevier, vol. 104(2), pages 315-358, September.
    57. Nelson, Charles R & Siegel, Andrew F, 1987. "Parsimonious Modeling of Yield Curves," The Journal of Business, University of Chicago Press, vol. 60(4), pages 473-489, October.
    58. Valentina Corradi & Norman Swanson, 2013. "A Survey of Recent Advances in Forecast Accuracy Comparison Testing, with an Extension to Stochastic Dominance," Departmental Working Papers 201309, Rutgers University, Department of Economics.
    59. G. Elliott & C. Granger & A. Timmermann (ed.), 2006. "Handbook of Economic Forecasting," Handbook of Economic Forecasting, Elsevier, edition 1, volume 1, number 1.
    60. Diebold, Francis X. & Rudebusch, Glenn D. & Aruoba, S. Borağan, 2006. "The macroeconomy and the yield curve: a dynamic latent factor approach," Journal of Econometrics, Elsevier, vol. 131(1-2), pages 309-338.
    61. Halbert White, 2000. "A Reality Check for Data Snooping," Econometrica, Econometric Society, vol. 68(5), pages 1097-1126, September.
    62. Mark W. Watson & James H. Stock, 2004. "Combination forecasts of output growth in a seven-country data set," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 23(6), pages 405-430.
    63. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    64. Kim, Hyun Hak & Swanson, Norman R., 2018. "Mining big data using parsimonious factor, machine learning, variable selection and shrinkage methods," International Journal of Forecasting, Elsevier, vol. 34(2), pages 339-354.

    Citations

    Cited by:

    1. Castle, Jennifer L. & Doornik, Jurgen A. & Hendry, David F., 2021. "Modelling non-stationary ‘Big Data’," International Journal of Forecasting, Elsevier, vol. 37(4), pages 1556-1575.
    2. Anghel, Dan Gabriel, 2021. "Data Snooping Bias in Tests of the Relative Performance of Multiple Forecasting Models," Journal of Banking & Finance, Elsevier, vol. 126(C).
    3. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    4. Cheng, Mingmian & Swanson, Norman R. & Yang, Xiye, 2021. "Forecasting volatility using double shrinkage methods," Journal of Empirical Finance, Elsevier, vol. 62(C), pages 46-61.
    5. Philip ME Garboden, 2019. "Sources and Types of Big Data for Macroeconomic Forecasting," Working Papers 2019-3, University of Hawaii Economic Research Organization, University of Hawaii at Manoa.
    6. Norman R. Swanson & Weiqi Xiong & Xiye Yang, 2020. "Predicting interest rates using shrinkage methods, real‐time diffusion indexes, and model combinations," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 35(5), pages 587-613, August.
    7. Yoshiki Nakajima & Naoya Sueishi, 2022. "Forecasting the Japanese macroeconomy using high-dimensional data," The Japanese Economic Review, Springer, vol. 73(2), pages 299-324, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Norman R. Swanson & Weiqi Xiong & Xiye Yang, 2020. "Predicting interest rates using shrinkage methods, real‐time diffusion indexes, and model combinations," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 35(5), pages 587-613, August.
    2. Barbara Rossi, 2019. "Forecasting in the presence of instabilities: How do we know whether models predict well and how to improve them," Economics Working Papers 1711, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 2021.
    3. Jin, Sainan & Corradi, Valentina & Swanson, Norman R., 2017. "Robust Forecast Comparison," Econometric Theory, Cambridge University Press, vol. 33(6), pages 1306-1351, December.
    4. Kim, Hyun Hak & Swanson, Norman R., 2018. "Mining big data using parsimonious factor, machine learning, variable selection and shrinkage methods," International Journal of Forecasting, Elsevier, vol. 34(2), pages 339-354.
    5. Kim, Hyun Hak & Swanson, Norman R., 2014. "Forecasting financial and macroeconomic variables using data reduction methods: New empirical evidence," Journal of Econometrics, Elsevier, vol. 178(P2), pages 352-367.
    6. Corradi, Valentina & Swanson, Norman R., 2014. "Testing for structural stability of factor augmented forecasting models," Journal of Econometrics, Elsevier, vol. 182(1), pages 100-118.
    7. Cepni, Oguzhan & Güney, I. Ethem & Swanson, Norman R., 2019. "Nowcasting and forecasting GDP in emerging markets using global financial and macroeconomic diffusion indexes," International Journal of Forecasting, Elsevier, vol. 35(2), pages 555-572.
    8. Hyun Hak Kim & Norman Swanson, 2013. "Mining Big Data Using Parsimonious Factor and Shrinkage Methods," Departmental Working Papers 201316, Rutgers University, Department of Economics.
    9. Marine Carrasco & Barbara Rossi, 2016. "In-Sample Inference and Forecasting in Misspecified Factor Models," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(3), pages 313-338, July.
    10. Jack Fosten, 2016. "Forecast evaluation with factor-augmented models," University of East Anglia School of Economics Working Paper Series 2016-05, School of Economics, University of East Anglia, Norwich, UK.
    11. Rachidi Kotchoni & Maxime Leroux & Dalibor Stevanovic, 2019. "Macroeconomic forecast accuracy in a data‐rich environment," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 34(7), pages 1050-1072, November.
    12. Kihwan Kim & Hyun Hak Kim & Norman R. Swanson, 2023. "Mixing mixed frequency and diffusion indices in good times and in bad: an assessment based on historical data around the great recession of 2008," Empirical Economics, Springer, vol. 64(3), pages 1421-1469, March.
    13. Cheng, Xu & Hansen, Bruce E., 2015. "Forecasting with factor-augmented regression: A frequentist model averaging approach," Journal of Econometrics, Elsevier, vol. 186(2), pages 280-293.
    14. Stock, J.H. & Watson, M.W., 2016. "Dynamic Factor Models, Factor-Augmented Vector Autoregressions, and Structural Vector Autoregressions in Macroeconomics," Handbook of Macroeconomics, in: J. B. Taylor & Harald Uhlig (ed.), Handbook of Macroeconomics, edition 1, volume 2, chapter 0, pages 415-525, Elsevier.
    15. Luciani, Matteo, 2014. "Forecasting with approximate dynamic factor models: The role of non-pervasive shocks," International Journal of Forecasting, Elsevier, vol. 30(1), pages 20-29.
    16. Inske Pirschel & Maik H. Wolters, 2018. "Forecasting with large datasets: compressing information before, during or after the estimation?," Empirical Economics, Springer, vol. 55(2), pages 573-596, September.
    17. Raffaella Giacomini & Barbara Rossi, 2013. "Forecasting in macroeconomics," Chapters, in: Nigar Hashimzade & Michael A. Thornton (ed.), Handbook of Research Methods and Applications in Empirical Macroeconomics, chapter 17, pages 381-408, Edward Elgar Publishing.
    18. Mogliani, Matteo & Simoni, Anna, 2021. "Bayesian MIDAS penalized regressions: Estimation, selection, and prediction," Journal of Econometrics, Elsevier, vol. 222(1), pages 833-860.
    19. Corradi, Valentina & Swanson, Norman R., 2006. "Bootstrap conditional distribution tests in the presence of dynamic misspecification," Journal of Econometrics, Elsevier, vol. 133(2), pages 779-806, August.
    20. Hyun Hak Kim, 2013. "Forecasting Macroeconomic Variables Using Data Dimension Reduction Methods: The Case of Korea," Working Papers 2013-26, Economic Research Institute, Bank of Korea.

    More about this item

    JEL classification:

    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
