
Economic predictions with big data: the illusion of sparsity

Author

Listed:
  • Domenico Giannone
  • Michele Lenza
  • Giorgio E. Primiceri

Abstract

We compare sparse and dense representations of predictive models in macroeconomics, microeconomics, and finance. To deal with a large number of possible predictors, we specify a prior that allows for both variable selection and shrinkage. The posterior distribution does not typically concentrate on a single sparse or dense model, but on a wide set of models. A clearer pattern of sparsity can only emerge when models of very low dimension are strongly favored a priori.
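The abstract does not give the functional form of the prior. One common construction that permits both variable selection and shrinkage is a spike-and-slab prior, sketched below purely as an illustration. The function name `draw_spike_and_slab` and the parameters `q` (prior inclusion probability) and `slab_sd` (standard deviation of the nonzero coefficients) are hypothetical and are not taken from this record.

```python
import random


def draw_spike_and_slab(k, q, slab_sd, rng):
    """Draw k coefficients from an illustrative spike-and-slab prior.

    Each coefficient is nonzero with probability q (variable selection);
    nonzero coefficients are Gaussian with standard deviation slab_sd,
    so a small slab_sd pulls them toward zero (shrinkage).
    """
    coefs = []
    for _ in range(k):
        included = rng.random() < q
        coefs.append(rng.gauss(0.0, slab_sd) if included else 0.0)
    return coefs


rng = random.Random(0)

# Sparse representation: few predictors enter, each with a sizeable coefficient.
sparse = draw_spike_and_slab(k=100, q=0.05, slab_sd=1.0, rng=rng)

# Dense representation: every predictor enters, each heavily shrunk.
dense = draw_spike_and_slab(k=100, q=1.0, slab_sd=0.1, rng=rng)

n_sparse = sum(c != 0.0 for c in sparse)
n_dense = sum(c != 0.0 for c in dense)
print(n_sparse, n_dense)
```

Under this assumed setup, a low `q` favors models of very low dimension a priori, while `q` close to 1 with a small `slab_sd` corresponds to a dense, heavily shrunk model.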

Suggested Citation

  • Domenico Giannone & Michele Lenza & Giorgio E. Primiceri, 2018. "Economic predictions with big data: the illusion of sparsity," Staff Reports 847, Federal Reserve Bank of New York.
  • Handle: RePEc:fip:fednsr:847

    Download full text from publisher

    File URL: https://www.newyorkfed.org/research/staff_reports/sr847.html
    File Function: Summary
    Download Restriction: no

    File URL: https://www.newyorkfed.org/medialibrary/media/research/staff_reports/sr847.pdf
    File Function: Full text
    Download Restriction: no

    References listed on IDEAS

    1. Ivo Welch & Amit Goyal, 2008. "A Comprehensive Look at The Empirical Performance of Equity Premium Prediction," Review of Financial Studies, Society for Financial Studies, vol. 21(4), pages 1455-1508, July.
    2. Domenico Giannone & Michele Lenza & Giorgio E. Primiceri, 2021. "Economic Predictions With Big Data: The Illusion of Sparsity," Econometrica, Econometric Society, vol. 89(5), pages 2409-2437, September.
    3. David Rapach & Jack Strauss, 2010. "Bagging or Combining (or Both)? An Analysis Based on Forecasting U.S. Employment Growth," Econometric Reviews, Taylor & Francis Journals, vol. 29(5-6), pages 511-533.
    4. Leamer, Edward E, 1973. "Multicollinearity: A Bayesian Interpretation," The Review of Economics and Statistics, MIT Press, vol. 55(3), pages 371-380, August.
    5. A. Chudik & G. Kapetanios & M. Hashem Pesaran, 2018. "A One Covariate at a Time, Multiple Testing Approach to Variable Selection in High‐Dimensional Linear Regression Models," Econometrica, Econometric Society, vol. 86(4), pages 1479-1512, July.
    6. Carlos M. Carvalho & Nicholas G. Polson & James G. Scott, 2010. "The horseshoe estimator for sparse signals," Biometrika, Biometrika Trust, vol. 97(2), pages 465-480.
    7. Michael W. McCracken & Serena Ng, 2016. "FRED-MD: A Monthly Database for Macroeconomic Research," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 574-589, October.
    8. Inoue, Atsushi & Kilian, Lutz, 2008. "How Useful Is Bagging in Forecasting Economic Time Series? A Case Study of U.S. Consumer Price Inflation," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 511-522, June.
    9. Leeb, Hannes & Potscher, Benedikt M., 2008. "Sparse estimators and the oracle property, or the return of Hodges' estimator," Journal of Econometrics, Elsevier, vol. 142(1), pages 201-211, January.
    10. Kozak, Serhiy & Nagel, Stefan & Santosh, Shrihari, 2020. "Shrinking the cross-section," Journal of Financial Economics, Elsevier, vol. 135(2), pages 271-292.
    11. Serena Ng, 2014. "Viewpoint: Boosting Recessions," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 47(1), pages 1-34, February.
    12. Carmen Fernandez & Eduardo Ley & Mark F. J. Steel, 2001. "Model uncertainty in cross-country growth regressions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 16(5), pages 563-576.
    13. Bańbura, Marta & Giannone, Domenico & Lenza, Michele, 2015. "Conditional forecasts and scenario analysis with vector autoregressions for large cross-sections," International Journal of Forecasting, Elsevier, vol. 31(3), pages 739-756.
    14. Leeb, Hannes & Pötscher, Benedikt M., 2005. "Model Selection And Inference: Facts And Fiction," Econometric Theory, Cambridge University Press, vol. 21(1), pages 21-59, February.
    15. Robert J. Barro, 1991. "Economic Growth in a Cross Section of Countries," The Quarterly Journal of Economics, Oxford University Press, vol. 106(2), pages 407-443.
    16. Barro, Robert J. & Lee, Jong-Wha, 1994. "Sources of economic growth," Carnegie-Rochester Conference Series on Public Policy, Elsevier, vol. 40(1), pages 1-46, June.
    17. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    18. Ng, Serena, 2013. "Variable Selection in Predictive Regressions," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 752-789, Elsevier.
    19. De Mol, Christine & Giannone, Domenico & Reichlin, Lucrezia, 2006. "Forecasting using a large number of predictors: is Bayesian regression a valid alternative to principal components?," Discussion Paper Series 1: Economic Studies 2006,32, Deutsche Bundesbank.
    20. John J. Donohue III & Steven D. Levitt, 2001. "The Impact of Legalized Abortion on Crime," The Quarterly Journal of Economics, Oxford University Press, vol. 116(2), pages 379-420.
    21. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    22. Jonathan H. Wright, 2009. "Forecasting US inflation by Bayesian model averaging," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 28(2), pages 131-144.
    23. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," Review of Economic Studies, Oxford University Press, vol. 81(2), pages 608-650.
    24. Jon Faust & Simon Gilchrist & Jonathan H. Wright & Egon Zakrajšek, 2013. "Credit Spreads as Predictors of Real-Time Economic Activity: A Bayesian Model-Averaging Approach," The Review of Economics and Statistics, MIT Press, vol. 95(5), pages 1501-1519, December.
    25. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2011. "Inference for high-dimensional sparse econometric models," CeMMAP working papers CWP41/11, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    26. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach," Annual Review of Economics, Annual Reviews, vol. 7(1), pages 649-688, August.
    27. Joachim Freyberger & Andreas Neuhierl & Michael Weber, 2020. "Dissecting Characteristics Nonparametrically," Review of Financial Studies, Society for Financial Studies, vol. 33(5), pages 2326-2377.
    28. Sainan Jin & Liangjun Su & Aman Ullah, 2014. "Robustify Financial Time Series Forecasting with Bagging," Econometric Reviews, Taylor & Francis Journals, vol. 33(5-6), pages 575-605, August.
    29. Susan Athey & Mohsen Bayati & Guido Imbens & Zhaonan Qu, 2019. "Ensemble Methods for Causal Effects in Panel Data Settings," AEA Papers and Proceedings, American Economic Association, vol. 109, pages 65-70, May.
    30. Alberto Abadie & Maximilian Kasy, 2019. "Choosing Among Regularized Estimators in Empirical Economics: The Risk of Machine Learning," The Review of Economics and Statistics, MIT Press, vol. 101(5), pages 743-762, December.
    31. A. Belloni & V. Chernozhukov & L. Wang, 2011. "Square-root lasso: pivotal recovery of sparse signals via conic programming," Biometrika, Biometrika Trust, vol. 98(4), pages 791-806.
    32. Park, Trevor & Casella, George, 2008. "The Bayesian Lasso," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 681-686, June.
    33. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Can One Estimate The Unconditional Distribution Of Post-Model-Selection Estimators?," Econometric Theory, Cambridge University Press, vol. 24(2), pages 338-376, April.
    34. De Mol, Christine & Giannone, Domenico & Reichlin, Lucrezia, 2008. "Forecasting using a large number of predictors: Is Bayesian shrinkage a valid alternative to principal components?," Journal of Econometrics, Elsevier, vol. 146(2), pages 318-328, October.
    35. Xavier Sala-I-Martin & Gernot Doppelhofer & Ronald I. Miller, 2004. "Determinants of Long-Term Growth: A Bayesian Averaging of Classical Estimates (BACE) Approach," American Economic Review, American Economic Association, vol. 94(4), pages 813-835, September.
    36. Liang, Feng & Paulo, Rui & Molina, German & Clyde, Merlise A. & Berger, Jim O., 2008. "Mixtures of g Priors for Bayesian Variable Selection," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 410-423, March.
    37. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    38. Jushan Bai & Serena Ng, 2009. "Boosting diffusion indices," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 24(4), pages 607-629.
    39. K. J. Martijn Cremers, 2002. "Stock Return Predictability: A Bayesian Model Selection Perspective," Review of Financial Studies, Society for Financial Studies, vol. 15(4), pages 1223-1249.
    40. Stock J.H. & Watson M.W., 2002. "Forecasting Using Principal Components From a Large Number of Predictors," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 1167-1179, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniele Bianchi & Kenichiro McAlinn, 2018. "Large-Scale Dynamic Predictive Regressions," Papers 1803.06738, arXiv.org.
    2. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stephane Surprenant, 2020. "How is Machine Learning Useful for Macroeconomic Forecasting?," Working Papers 20-01, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management, revised Aug 2020.
    3. Lee, Ji Hyung & Shi, Zhentao & Gao, Zhan, 2022. "On LASSO for predictive regression," Journal of Econometrics, Elsevier, vol. 229(2), pages 322-349.
    4. Daniel Borup & Bent Jesper Christensen & Nicolaj Nørgaard Mühlbach & Mikkel Slot Nielsen, 2020. "Targeting predictors in random forest regression," Papers 2004.01411, arXiv.org, revised Nov 2020.
    5. Philippe Goulet Coulombe & Maxime Leroux & Dalibor Stevanovic & Stéphane Surprenant, 2022. "How is machine learning useful for macroeconomic forecasting?," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 920-964, August.
    6. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    7. Cheng, Xu & Hansen, Bruce E., 2015. "Forecasting with factor-augmented regression: A frequentist model averaging approach," Journal of Econometrics, Elsevier, vol. 186(2), pages 280-293.
    8. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2019. "Valid Post-Selection Inference in High-Dimensional Approximately Sparse Quantile Regression Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 749-758, April.
    10. Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
    11. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2020. "Machine Learning Advances for Time Series Forecasting," Papers 2012.12802, arXiv.org, revised Apr 2021.
    12. Byron Botha & Rulof Burger & Kevin Kotze & Neil Rankin & Daan Steenkamp, 2022. "Big data forecasting of South African inflation," School of Economics Macroeconomic Discussion Paper Series 2022-03, School of Economics, University of Cape Town.
    13. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Robust inference in high-dimensional approximately sparse quantile regression models," CeMMAP working papers CWP70/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    14. Gary Koop, 2012. "Using VARs and TVP-VARs with Many Macroeconomic Variables," Central European Journal of Economic Modelling and Econometrics, Central European Journal of Economic Modelling and Econometrics, vol. 4(3), pages 143-167, September.
    15. Dimitris Korobilis, 2018. "Machine Learning Macroeconometrics: A Primer," Working Paper series 18-30, Rimini Centre for Economic Analysis.
    16. Mykola Babiak & Jozef Barunik, 2020. "Deep Learning, Predictability, and Optimal Portfolio Returns," CERGE-EI Working Papers wp677, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    17. Cross, Jamie L. & Hou, Chenghan & Poon, Aubrey, 2020. "Macroeconomic forecasting with large Bayesian VARs: Global-local priors and the illusion of sparsity," International Journal of Forecasting, Elsevier, vol. 36(3), pages 899-915.
    18. Croux, Christophe & Jagtiani, Julapa & Korivi, Tarunsai & Vulanovic, Milos, 2020. "Important factors determining Fintech loan default: Evidence from a lendingclub consumer platform," Journal of Economic Behavior & Organization, Elsevier, vol. 173(C), pages 270-296.
    19. Tommaso Proietti, 2016. "On the Selection of Common Factors for Macroeconomic Forecasting," Advances in Econometrics, in: Dynamic Factor Models, volume 35, pages 593-628, Emerald Group Publishing Limited.
    20. Ning Xu & Jian Hong & Timothy C. G. Fisher, 2016. "Model selection consistency from the perspective of generalization ability and VC theory with an application to Lasso," Papers 1606.00142, arXiv.org.

    More about this item

    Keywords

    model selection; shrinkage; high dimensional data;

    JEL classification:

    • C11 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Bayesian Analysis: General
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
