Quantitative Tools for Time Series Analysis in Natural Language Processing: A Practitioners Guide

Author

Listed:
  • W. Benedikt Schmal

Abstract

Natural language processing tools are now frequently used in social sciences such as economics, political science, and sociology. Many publications apply topic modeling to elicit latent topics in text corpora and to trace their development over time. Most of these publications rely on visual inspection to draw inferences about changes, structural breaks, and developments over time. We suggest using univariate time series econometrics to introduce more quantitative rigor and thereby strengthen such analyses. In particular, we discuss the econometric topics of non-stationarity and structural breaks. This paper serves as a comprehensive practitioner's guide, giving researchers in the social and life sciences as well as the humanities concise advice on implementing econometric time series methods to investigate topic prevalence over time. We provide coding advice for the statistical software R throughout the paper. An application of the discussed tools to a sample dataset completes the analysis.

Suggested Citation

  • W. Benedikt Schmal, 2024. "Quantitative Tools for Time Series Analysis in Natural Language Processing: A Practitioners Guide," Papers 2404.18499, arXiv.org.
  • Handle: RePEc:arx:papers:2404.18499

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2404.18499
    File Function: Latest version
    Download Restriction: no