Printed from https://ideas.repec.org/p/zbw/rwirep/964.html

Text data rule - don't they? A study on the (additional) information of Handelsblatt data for nowcasting German GDP in comparison to established economic indicators

Authors

  • Shrub, Yuliya
  • Rieger, Jonas
  • Müller, Henrik
  • Jentsch, Carsten

Abstract

Timely information on the current state of the economy is required for prediction purposes and is crucial for prompt policy adjustment and economic decision-making. While important macroeconomic indicators are reported only quarterly and published with substantial delay, other related data are available more frequently: monthly, weekly, daily or even more often. The goal of nowcasting methods is to use such more frequently collected variables to update predictions of less often reported variables such as GDP growth. In this paper, we propose a mixed-frequency model to investigate the potential of text data in the form of newspaper articles for nowcasting German GDP growth. Newspaper text data appear to be very helpful in this regard, as they directly explain economic and social progress influencing GDP growth and are updated frequently without substantial delay. We compare several setups based on commonly used macro variables, with and without additional information from text data (extracted in an unsupervised manner), as well as a setup based only on such text data. To deal with the high dimensionality of the data, we use principal component regression, penalization techniques and random forests. Our results indicate that including text data can yield certain benefits for nowcasting, but that the information extracted from text data in an unsupervised manner still tends to contain too much irrelevant noise, which hampers the performance of the resulting nowcasting approach.
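The abstract names the ingredients (a mixed-frequency setup and principal component regression for high-dimensional indicators) without spelling out the model. The following is a minimal sketch of how these two pieces can fit together, not the authors' implementation. It assumes a "blocked" alignment in which the three monthly values of each quarter enter as separate regressors, and it runs on simulated data rather than the Handelsblatt corpus. All function and variable names are hypothetical.

```python
import numpy as np


def block_monthly_to_quarterly(monthly):
    """Place the three monthly observations of each quarter side by side.

    monthly: array of shape (3 * n_quarters, n_indicators)
    returns: array of shape (n_quarters, 3 * n_indicators)
    """
    n_months, n_ind = monthly.shape
    if n_months % 3 != 0:
        raise ValueError("number of months must be a multiple of 3")
    return monthly.reshape(n_months // 3, 3 * n_ind)


def fit_pcr(X, y, n_components):
    """Principal component regression: OLS of y on the leading components of X."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0)
    x_std[x_std == 0] = 1.0  # guard against constant columns
    Z = (X - x_mean) / x_std
    # Rows of Vt are the principal directions of the standardized regressors.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T  # shape (n_features, n_components)
    design = np.column_stack([np.ones(len(y)), Z @ loadings])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {"mean": x_mean, "std": x_std, "loadings": loadings, "coef": coef}


def predict_pcr(model, X):
    """Project new regressors on the stored loadings and apply the OLS fit."""
    Z = (X - model["mean"]) / model["std"]
    return model["coef"][0] + (Z @ model["loadings"]) @ model["coef"][1:]


# Simulated example (assumption: one common monthly factor drives both the
# indicators and quarterly growth; this is illustrative data only).
rng = np.random.default_rng(0)
n_quarters, n_ind = 80, 10
factor = rng.normal(size=3 * n_quarters)
monthly = (factor[:, None] * rng.uniform(0.5, 1.5, size=n_ind)
           + 0.5 * rng.normal(size=(3 * n_quarters, n_ind)))
X = block_monthly_to_quarterly(monthly)
gdp_growth = (factor.reshape(n_quarters, 3).mean(axis=1)
              + 0.1 * rng.normal(size=n_quarters))

# Fit on the first 60 quarters, nowcast the remaining 20.
train, test = slice(0, 60), slice(60, n_quarters)
model = fit_pcr(X[train], gdp_growth[train], n_components=3)
nowcast = predict_pcr(model, X[test])

rmse_pcr = np.sqrt(np.mean((nowcast - gdp_growth[test]) ** 2))
rmse_mean = np.sqrt(np.mean((gdp_growth[train].mean() - gdp_growth[test]) ** 2))
print(f"PCR RMSE: {rmse_pcr:.3f}, historical-mean RMSE: {rmse_mean:.3f}")
```

In a setup of this kind, text-based series (for example, monthly topic shares from a topic model) would simply be added as further columns of the monthly matrix before blocking. Penalized regression or a random forest could replace the PCR step on the same blocked regressors.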

Suggested Citation

  • Shrub, Yuliya & Rieger, Jonas & Müller, Henrik & Jentsch, Carsten, 2022. "Text data rule - don't they? A study on the (additional) information of Handelsblatt data for nowcasting German GDP in comparison to established economic indicators," Ruhr Economic Papers 964, RWI - Leibniz-Institut für Wirtschaftsforschung, Ruhr-University Bochum, TU Dortmund University, University of Duisburg-Essen.
  • Handle: RePEc:zbw:rwirep:964
    DOI: 10.4419/96973128

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/264400/1/1816318698.pdf
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2020. "News Media vs. FRED-MD for Macroeconomic Forecasting," CESifo Working Paper Series 8639, CESifo.
    2. Aprigliano, Valentina & Emiliozzi, Simone & Guaitoli, Gabriele & Luciani, Andrea & Marcucci, Juri & Monteforte, Libero, 2023. "The power of text-based indicators in forecasting Italian economic activity," International Journal of Forecasting, Elsevier, vol. 39(2), pages 791-808.
    3. Mogliani, Matteo & Darné, Olivier & Pluyaud, Bertrand, 2017. "The new MIBA model: Real-time nowcasting of French GDP using the Banque de France's monthly business survey," Economic Modelling, Elsevier, vol. 64(C), pages 26-39.
    4. Dorinth van Dijk & Jasper de Winter, 2023. "Nowcasting GDP using tone-adjusted time varying news topics: Evidence from the financial press," Working Papers 766, DNB.
    5. Knut Are Aastveit & Tuva Marie Fastbø & Eleonora Granziera & Kenneth Sæterhagen Paulsen & Kjersti Næss Torstensen, 2020. "Nowcasting Norwegian household consumption with debit card transaction data," Working Paper 2020/17, Norges Bank.
    6. Kenichiro McAlinn, 2021. "Mixed‐frequency Bayesian predictive synthesis for economic nowcasting," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(5), pages 1143-1163, November.
    7. Mikhaylov, Dmitry, 2023. "Macroeconomic Forecasting with the Use of News Data," Working Papers w20220250, Russian Presidential Academy of National Economy and Public Administration.
    8. Erik Andres-Escayola & Corinna Ghirelli & Luis Molina & Javier J. Pérez & Elena Vidal, 2022. "Using newspapers for textual indicators: which and how many?," Working Papers 2235, Banco de España.
    9. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2022. "News media versus FRED‐MD for macroeconomic forecasting," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(1), pages 63-81, January.
    10. Saiz, Lorena & Ashwin, Julian & Kalamara, Eleni, 2021. "Nowcasting euro area GDP with news sentiment: a tale of two crises," Working Paper Series 2616, European Central Bank.
    11. Michael W. McCracken & Michael T. Owyang & Tatevik Sekhposyan, 2021. "Real-Time Forecasting and Scenario Analysis Using a Large Mixed-Frequency Bayesian VAR," International Journal of Central Banking, International Journal of Central Banking, vol. 17(71), pages 1-41, December.
    12. Magnus Reif, 2020. "Macroeconomics, Nonlinearities, and the Business Cycle," ifo Beiträge zur Wirtschaftsforschung, ifo Institute - Leibniz Institute for Economic Research at the University of Munich, number 87.
    13. Massimo Ferrari Minesso & Laura Lebastard & Helena Mezo, 2023. "Text-Based Recession Probabilities," IMF Economic Review, Palgrave Macmillan;International Monetary Fund, vol. 71(2), pages 415-438, June.
    14. Boriss Siliverstovs, 2017. "Short-term forecasting with mixed-frequency data: a MIDASSO approach," Applied Economics, Taylor & Francis Journals, vol. 49(13), pages 1326-1343, March.
    15. Andrea Carriero & Todd E. Clark & Massimiliano Marcellino, 2020. "Nowcasting Tail Risks to Economic Activity with Many Indicators," Working Papers 20-13R2, Federal Reserve Bank of Cleveland, revised 22 Sep 2020.
    16. Deimante Teresiene & Greta Keliuotyte-Staniuleniene & Yiyi Liao & Rasa Kanapickiene & Ruihui Pu & Siyan Hu & Xiao-Guang Yue, 2021. "The Impact of the COVID-19 Pandemic on Consumer and Business Confidence Indicators," JRFM, MDPI, vol. 14(4), pages 1-23, April.
    17. Mogliani, Matteo & Simoni, Anna, 2021. "Bayesian MIDAS penalized regressions: Estimation, selection, and prediction," Journal of Econometrics, Elsevier, vol. 222(1), pages 833-860.
    18. Robert Lehmann & Sascha Möhrle, 2022. "Forecasting Regional Industrial Production with High-Frequency Electricity Consumption Data," CESifo Working Paper Series 9917, CESifo.
    19. Robert M. Kunst & Martin Wagner, 2020. "Economic forecasting: editors’ introduction," Empirical Economics, Springer, vol. 58(1), pages 1-5, January.
    20. Marozzi, Armando, 2021. "The ECB's tracker: nowcasting the press conferences of the ECB," Working Paper Series 2609, European Central Bank.

    More about this item

    Keywords

    Topic model; latent Dirichlet allocation; text mining; econometrics; gross domestic product; prediction; forecast

    JEL classification:

    • C52 - Mathematical and Quantitative Methods > Econometric Modeling > Model Evaluation, Validation, and Selection
    • C53 - Mathematical and Quantitative Methods > Econometric Modeling > Forecasting and Prediction Models; Simulation Methods
    • C55 - Mathematical and Quantitative Methods > Econometric Modeling > Large Data Sets: Modeling and Analysis
    • E37 - Macroeconomics and Monetary Economics > Prices, Business Fluctuations, and Cycles > Forecasting and Simulation: Models and Applications
