
Quantitative Tools for Time Series Analysis in Natural Language Processing: A Practitioners Guide

Author

Listed:
  • W. Benedikt Schmal

Abstract

Natural language processing tools are now frequently used in social sciences such as economics, political science, and sociology. Many publications apply topic modeling to elicit latent topics in text corpora and trace their development over time. Most of these publications rely on visual inspection to draw inferences about changes, structural breaks, and developments over time. We suggest using univariate time series econometrics to introduce more quantitative rigor that can strengthen the analyses. In particular, we discuss the econometric topics of non-stationarity and structural breaks. This paper serves as a comprehensive practitioner's guide that gives researchers in the social and life sciences as well as the humanities concise advice on how to implement econometric time series methods to thoroughly investigate topic prevalences over time. We provide coding advice for the statistical software R throughout the paper. An application of the discussed tools to a sample dataset completes the analysis.
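To make the idea of testing for a structural break in a topic prevalence series concrete, here is a minimal sketch. It is not taken from the paper, which gives its coding advice in R; it is a plain Python illustration of one simple technique, the least-squares estimate of a single break date in the mean of a univariate series. The function names `sum_squared_residuals` and `estimate_mean_break`, the `trim` parameter, and the simulated prevalence values are all assumptions made for illustration.

```python
import random


def sum_squared_residuals(values):
    """Sum of squared deviations of `values` from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def estimate_mean_break(series, trim=0.15):
    """Least-squares estimate of a single break in the mean (hypothetical helper).

    Returns the index of the first observation of the second regime.
    `trim` excludes candidate breaks too close to either end of the sample.
    """
    n = len(series)
    first = max(1, int(n * trim))
    last = min(n - 1, n - int(n * trim))
    if first > last:
        raise ValueError("series too short for the chosen trimming")
    # For each candidate break, fit a separate mean before and after it
    # and keep the candidate with the smallest total squared error.
    return min(
        range(first, last + 1),
        key=lambda k: sum_squared_residuals(series[:k])
        + sum_squared_residuals(series[k:]),
    )


# Simulated (invented) topic prevalence: the mean shifts from 0.10 to 0.20
# after 40 periods, with Gaussian noise around each level.
random.seed(42)
simulated = [0.10 + random.gauss(0, 0.02) for _ in range(40)] + [
    0.20 + random.gauss(0, 0.02) for _ in range(40)
]

break_index = estimate_mean_break(simulated)
print("Estimated first period of the new regime:", break_index)
```

This only locates the most likely break date; it does not test whether a break exists at all, and it assumes a stationary series around each regime mean. A formal analysis would pair it with a stationarity test and a significance test for the break.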

Suggested Citation

  • W. Benedikt Schmal, 2024. "Quantitative Tools for Time Series Analysis in Natural Language Processing: A Practitioners Guide," Papers 2404.18499, arXiv.org.
  • Handle: RePEc:arx:papers:2404.18499

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2404.18499
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Venus Khim-Sen Liew & Hock-Ann Lee & Kian-Ping Lim & Huay-Huay Lee, 2008. "Linearity and Stationarity of South Asian Real Exchange Rates," The IUP Journal of Applied Economics, IUP Publications, vol. 0(5), pages 48-58, September.
    2. Eleni Constantinou & Robert Georgiades & Avo Kazandjian & George Kouretas, 2005. "Mean and variance causality between the Cyprus Stock Exchange and major equity markets," Working Papers 0501, University of Crete, Department of Economics.
    3. Eleni Constantinou & Avo Kazandjian & Georgios P. Kouretas & Vera Tahmazian, 2008. "Common Stochastic Trends Among The Cyprus Stock Exchange And The Ase, Lse And Nyse," Bulletin of Economic Research, Wiley Blackwell, vol. 60(4), pages 327-349, October.
    4. Diamandis, Panayiotis F., 2009. "International stock market linkages: Evidence from Latin America," Global Finance Journal, Elsevier, vol. 20(1), pages 13-30.
    5. Juan A. Román Aso & Jaime Vallés Giménez, 2016. "Air Emissions Performance: A Dynamic Analysis for Spain," Hacienda Pública Española / Review of Public Economics, IEF, vol. 218(3), pages 57-78, September.
    6. Baum, Christopher F. & Barkoulas, John, 2006. "Dynamics of Intra-EMS Interest Rate Linkages," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 38(2), pages 469-482, March.
    7. Jorge Andrés Tamayo Castaño, 2012. "Asimetrías en la demanda por trabajo en Colombia: el papel del ciclo económico," Borradores de Economia 689, Banco de la Republica de Colombia.
    8. Caporale, Guglielmo Maria & Kontonikas, Alexandros, 2009. "The Euro and inflation uncertainty in the European Monetary Union," Journal of International Money and Finance, Elsevier, vol. 28(6), pages 954-971, October.
    9. Hirukawa, Masayuki, 2023. "Robust Covariance Matrix Estimation in Time Series: A Review," Econometrics and Statistics, Elsevier, vol. 27(C), pages 36-61.
    10. André Varella Mollick & João Ricardo Faria & Pedro H. Albuquerque & Miguel A. León-Ledesma, 2008. "Can globalisation stop the decline in commodities' terms of trade?," Cambridge Journal of Economics, Cambridge Political Economy Society, vol. 32(5), pages 683-701, September.
    11. Torben G. Andersen & Rasmus T. Varneskov, 2018. "Consistent Inference for Predictive Regressions in Persistent VAR Economies," CREATES Research Papers 2018-09, Department of Economics and Business Economics, Aarhus University.
    12. Harvey, David I. & Leybourne, Stephen J. & Taylor, A.M. Robert, 2009. "Unit Root Testing In Practice: Dealing With Uncertainty Over The Trend And Initial Condition," Econometric Theory, Cambridge University Press, vol. 25(3), pages 587-636, June.
    13. Graham M. Voss & M. Chaban, 2012. "National and Provincial Inflation in Canada: Experiences under Inflation Targeting," Department Discussion Papers 1201, Department of Economics, University of Victoria.
    14. María Dolores Gadea & Laura Mayoral, 2006. "The Persistence of Inflation in OECD Countries: A Fractionally Integrated Approach," International Journal of Central Banking, International Journal of Central Banking, vol. 2(1), March.
    15. Arghyrou, Michael G. & Gadea, Maria Dolores, 2012. "The single monetary policy and domestic macro-fundamentals: Evidence from Spain," Journal of Policy Modeling, Elsevier, vol. 34(1), pages 16-34.
    16. Hsu, Yi-Chung & Lee, Chien-Chiang & Lee, Chi-Chuan, 2008. "Revisited: Are shocks to energy consumption permanent or temporary? New evidence from a panel SURADF approach," Energy Economics, Elsevier, vol. 30(5), pages 2314-2330, September.
    17. Harvey, David I. & Leybourne, Stephen J. & Taylor, A.M. Robert, 2007. "A simple, robust and powerful test of the trend hypothesis," Journal of Econometrics, Elsevier, vol. 141(2), pages 1302-1330, December.
    18. Sabate, Marcela & Gadea, Maria Dolores & Escario, Regina, 2006. "Does fiscal policy influence monetary policy? The case of Spain, 1874-1935," Explorations in Economic History, Elsevier, vol. 43(2), pages 309-331, April.
    19. Cheng, Shu-Ching & Wu, Tsung-pao & Lee, Kuei-Chiu & Chang, Tsangyao, 2014. "Flexible Fourier unit root test of unemployment for PIIGS countries," Economic Modelling, Elsevier, vol. 36(C), pages 142-148.
    20. Anton Skrobotov, 2013. "Local Structural Trend Break in Stationarity Testing," Working Papers 0074, Gaidar Institute for Economic Policy, revised 2013.
