Printed from https://ideas.repec.org/p/sza/wpaper/wpapers366.html

A machine learning approach to domain specific dictionary generation. An economic time series framework

Author

Listed:
  • Hanjo Odendaal

    (Department of Economics, Stellenbosch University)

Abstract

This paper offers an alternative to the labour-intensive manual process of constructing a domain-specific lexicon or dictionary by operationalising subjective information processing. It builds on the current empirical literature by (a) constructing a domain-specific dictionary for various economic confidence indices; (b) introducing a novel weighting schema for text tokens that accounts for time dependence; and (c) operationalising subjective information processing of text data using machine learning. The results show that sentiment indices constructed from machine-generated dictionaries fit multiple indicators of economic activity better than the manually constructed dictionary of Loughran and McDonald (2011). The domain-specific dictionaries achieve a lower RMSE over a five-year holdout sample period from 2012 to 2017. The results also justify the time series weighting design used to overcome the p >> n problem (far more predictors than observations), which is common when working with economic time series and text data.

Suggested Citation

  • Hanjo Odendaal, 2021. "A machine learning approach to domain specific dictionary generation. An economic time series framework," Working Papers 06/2021, Stellenbosch University, Department of Economics.
  • Handle: RePEc:sza:wpaper:wpapers366

    Download full text from publisher

    File URL: https://www.ekon.sun.ac.za/wpapers/2021/wp062021/wp062021.pdf
    File Function: First version, 2021
    Download Restriction: no

    References listed on IDEAS

    1. Leduc, Sylvain & Sill, Keith & Stark, Tom, 2007. "Self-fulfilling expectations and the inflation of the 1970s: Evidence from the Livingston Survey," Journal of Monetary Economics, Elsevier, vol. 54(2), pages 433-459, March.
    2. N. Gregory Mankiw & Ricardo Reis, 2002. "Sticky Information versus Sticky Prices: A Proposal to Replace the New Keynesian Phillips Curve," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 117(4), pages 1295-1328.
    3. Hanjo Odendaal & Monique Reid & Johann F. Kirsten, 2020. "Media‐Based Sentiment Indices as an Alternative Measure of Consumer Confidence," South African Journal of Economics, Economic Society of South Africa, vol. 88(4), pages 409-434, December.
    4. Lamla, Michael J. & Lein, Sarah M., 2014. "The role of media for consumers’ inflation expectation formation," Journal of Economic Behavior & Organization, Elsevier, vol. 106(C), pages 62-77.
    5. Eleni Kalamara & Arthur Turrell & Chris Redl & George Kapetanios & Sujit Kapadia, 2022. "Making text count: Economic forecasting using newspaper text," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(5), pages 896-919, August.
    6. George A. Akerlof & William T. Dickens & George L. Perry, 2000. "Near-Rational Wage and Price Setting and the Long-Run Phillips Curve," Brookings Papers on Economic Activity, Economic Studies Program, The Brookings Institution, vol. 31(1), pages 1-60.
    7. Stuart N. Soroka & Dominik A. Stecula & Christopher Wlezien, 2015. "It's (Change in) the (Future) Economy, Stupid: Economic Indicators, the Media, and Public Opinion," American Journal of Political Science, John Wiley & Sons, vol. 59(2), pages 457-474, February.
    8. Christopher D. Carroll, 2003. "Macroeconomic Expectations of Households and Professional Forecasters," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 118(1), pages 269-298.
    9. Sims, Christopher A., 2003. "Implications of rational inattention," Journal of Monetary Economics, Elsevier, vol. 50(3), pages 665-690, April.
    10. Tim Loughran & Bill McDonald, 2011. "When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks," Journal of Finance, American Finance Association, vol. 66(1), pages 35-65, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bennani, Hamza, 2018. "Media coverage and ECB policy-making: Evidence from an augmented Taylor rule," Journal of Macroeconomics, Elsevier, vol. 57(C), pages 26-38.
    2. Sarah M. Lein & Thomas Maag, 2011. "The Formation Of Inflation Perceptions: Some Empirical Facts For European Countries," Scottish Journal of Political Economy, Scottish Economic Society, vol. 58(2), pages 155-188, May.
    3. Lena Draeger, 2011. "Endogenous persistence with recursive inattentiveness," KOF Working papers 11-285, KOF Swiss Economic Institute, ETH Zurich.
    4. Larsen, Vegard H. & Thorsrud, Leif Anders & Zhulanova, Julia, 2021. "News-driven inflation expectations and information rigidities," Journal of Monetary Economics, Elsevier, vol. 117(C), pages 507-520.
    5. Baranowski, Paweł & Doryń, Wirginia & Łyziak, Tomasz & Stanisławska, Ewa, 2021. "Words and deeds in managing expectations: Empirical evidence from an inflation targeting economy," Economic Modelling, Elsevier, vol. 95(C), pages 49-67.
    6. Michael J. Lamla & Thomas Maag, 2012. "The Role of Media for Inflation Forecast Disagreement of Households and Professional Forecasters," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 44(7), pages 1325-1350, October.
    7. de Mendonça, Helder Ferreira & Vereda, Luciano & Araujo, Mateus de Azevedo, 2022. "What type of information calls the attention of forecasters? Evidence from survey data in an emerging market," Journal of International Money and Finance, Elsevier, vol. 129(C).
    8. Juan Camilo Anzoátegui-Zapata & Juan Camilo Galvis-Ciro, 2020. "Disagreements in Consumer Inflation Expectations: Empirical Evidence for a Latin American Economy," Journal of Business Cycle Research, Springer;Centre for International Research on Economic Tendency Surveys (CIRET), vol. 16(2), pages 99-122, November.
    9. Kose, M. Ayhan & Matsuoka, Hideaki & Panizza, Ugo & Vorisek, Dana, 2019. "Inflation Expectations: Review and Evidence," CEPR Discussion Papers 13601, C.E.P.R. Discussion Papers.
    10. Claus, Edda & Nguyen, Viet Hoang, 2018. "Consumptor economicus: How do consumers form expectations on economic variables?," Journal of Economic Behavior & Organization, Elsevier, vol. 152(C), pages 254-275.
    11. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2018. "Business cycle narratives," Working Papers No 6/2018, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    12. Yingying Xu & Zhi-Xin Liu & Hsu-Ling Chang & Adelina Dumitrescu Peculea & Chi-Wei Su, 2017. "Does self-fulfilment of the inflation expectation exist?," Applied Economics, Taylor & Francis Journals, vol. 49(11), pages 1098-1113, March.
    13. Yingying Xu & Zhixin Liu & Zichao Jia & Chi-Wei Su, 2017. "Is time-variant information stickiness state-dependent?," Portuguese Economic Journal, Springer;Instituto Superior de Economia e Gestao, vol. 16(3), pages 169-187, December.
    14. Lamla, Michael J. & Lein, Sarah M., 2014. "The role of media for consumers’ inflation expectation formation," Journal of Economic Behavior & Organization, Elsevier, vol. 106(C), pages 62-77.
    15. Pfajfar, D. & Santoro, E., 2008. "Asymmetries in Inflation Expectation Formation Across Demographic Groups," Cambridge Working Papers in Economics 0824, Faculty of Economics, University of Cambridge.
    16. Menz, Jan-Oliver & Poppitz, Philipp, 2013. "Households' disagreement on inflation expectations and socioeconomic media exposure in Germany," Discussion Papers 27/2013, Deutsche Bundesbank.
    17. Lena Draeger & Michael J. Lamla, 2013. "Imperfect information and inflation expectations," KOF Working papers 13-329, KOF Swiss Economic Institute, ETH Zurich.
    18. Benjamin Beckers & Konstantin A. Kholodilin & Dirk Ulbricht, 2017. "Reading between the Lines: Using Media to Improve German Inflation Forecasts," Discussion Papers of DIW Berlin 1665, DIW Berlin, German Institute for Economic Research.
    19. Hamza Bennani, 2016. "Media Coverage and ECB Policy-Making: Evidence from a New Index," Working Papers hal-04141572, HAL.
    20. Hamza Bennani, 2016. "Media Coverage and ECB Policy-Making: Evidence from a New Index," EconomiX Working Papers 2016-38, University of Paris Nanterre, EconomiX.

    More about this item

    Keywords

    Sentometrics; Machine learning; Domain-specific dictionaries;

    JEL classification:

    • C32 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Time-Series Models; Dynamic Quantile Regressions; Dynamic Treatment Effect Models; Diffusion Processes; State Space Models
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis

