Printed from https://ideas.repec.org/p/mar/magkse/201815.html

Measuring the Diffusion of Innovations with Paragraph Vector Topic Models

Author

Listed:
  • David Lenz

    (Justus-Liebig-University Giessen)

  • Peter Winker

    (Justus-Liebig-University Giessen)

Abstract

The measurement of innovation diffusion from textual data sources other than patent data has not been studied extensively. However, early and accurate indicators of innovation, and the recognition of innovation trends, are essential for evidence-based policy making that promotes economic growth through technological progress. In this study, we propose the Paragraph Vector Topic Model (PVTM) and apply it to technology-related news articles to analyze innovation-related topics over time and gain insights into their diffusion process. PVTM represents documents in a semantic space, which has been shown to capture latent variables of the underlying documents, e.g. the latent topics. Clusters of documents in the semantic space can then be interpreted and transformed into meaningful topics by means of Gaussian mixture modeling. Using PVTM, we identify innovation-related topics in 170 thousand technology news articles published over a span of 20 years and gather insights about their diffusion state by measuring each topic's importance in the corpus over time. We find that PVTM diffusion indicators for certain topics Granger-cause Google Trends indices for matching search terms. Further, our results suggest that PVTM is well suited to discovering latent topics in (technology-related) news articles and that the diffusion of innovations could be assessed using topic importance measures derived from PVTM.
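
The clustering and importance-over-time steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the document vectors are synthetic stand-ins for paragraph vectors, the publication years are invented, and the importance measure (mean Gaussian mixture membership probability per period) is an assumption, since the abstract does not define the measure. The function name `topic_importance_by_period` is hypothetical. Only NumPy and scikit-learn's `GaussianMixture` are used.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for paragraph vectors (assumption): two well-separated
# clusters in a 5-dimensional "semantic space".
n_per_cluster = 100
vectors = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(n_per_cluster, 5)),
    rng.normal(loc=3.0, scale=0.5, size=(n_per_cluster, 5)),
])

# Invented publication years (assumption): the first cluster is mostly
# published in 2000, the second mostly in 2001.
years = np.concatenate([
    rng.choice([2000, 2001], size=n_per_cluster, p=[0.8, 0.2]),
    rng.choice([2000, 2001], size=n_per_cluster, p=[0.2, 0.8]),
])


def topic_importance_by_period(vectors, periods, n_topics, seed=0):
    """Fit a Gaussian mixture to the document vectors and return, for each
    period, the mean membership probability of every mixture component."""
    gmm = GaussianMixture(n_components=n_topics, random_state=seed).fit(vectors)
    probs = gmm.predict_proba(vectors)  # shape: (n_documents, n_topics)
    return {int(p): probs[periods == p].mean(axis=0) for p in np.unique(periods)}


importance = topic_importance_by_period(vectors, years, n_topics=2)
for year, shares in importance.items():
    print(year, np.round(shares, 2))
```

In this toy setup each period's shares sum to one, and the dominant component switches between the two years, which is the kind of shift a diffusion indicator would be expected to pick up. In practice the vectors would come from a trained paragraph vector model, and the number of mixture components would need to be chosen, for example by an information criterion.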

Suggested Citation

  • David Lenz & Peter Winker, 2018. "Measuring the Diffusion of Innovations with Paragraph Vector Topic Models," MAGKS Papers on Economics 201815, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
  • Handle: RePEc:mar:magkse:201815

    Download full text from publisher

    File URL: http://www.uni-marburg.de/fb02/makro/forschung/magkspapers/paper_2018/15-2018_lenz.pdf
    File Function: First 201815
    Download Restriction: no

    References listed on IDEAS

    1. Lüdering Jochen & Winker Peter, 2016. "Forward or Backward Looking? The Economic Discourse and the Observed Reality," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(4), pages 483-515, August.
    2. Bryan Kelly & Dimitris Papanikolaou & Amit Seru & Matt Taddy, 2021. "Measuring Technological Innovation over the Long Run," American Economic Review: Insights, American Economic Association, vol. 3(3), pages 303-320, September.
    3. Stephen Hansen & Michael McMahon & Andrea Prat, 2018. "Transparency and Deliberation Within the FOMC: A Computational Linguistics Approach," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 133(2), pages 801-870.
    4. Leah G. Nichols, 2014. "A topic model approach to measuring interdisciplinarity at the National Science Foundation," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(3), pages 741-754, September.
    5. Lino Wehrheim, 2019. "Economic history goes digital: topic modeling the Journal of Economic History," Cliometrica, Springer;Cliometric Society (Association Francaise de Cliométrie), vol. 13(1), pages 83-125, January.
    6. Hansen, Stephen & McMahon, Michael, 2016. "Shocking language: Understanding the macroeconomic effects of central bank communication," Journal of International Economics, Elsevier, vol. 99(S1), pages 114-133.
    7. Lino Wehrheim, 2019. "Economic history goes digital: topic modeling the Journal of Economic History," Cliometrica, Journal of Historical Economics and Econometric History, Association Française de Cliométrie (AFC), vol. 13(1), pages 83-125, January.
    8. Hyunyoung Choi & Hal Varian, 2012. "Predicting the Present with Google Trends," The Economic Record, The Economic Society of Australia, vol. 88(s1), pages 2-9, June.
    9. Stathoulopoulos, Kostas & Mateos-Garcia, Juan, 2017. "Mapping without a map: Exploring the UK business landscape using unsupervised learning," SocArXiv ryxdk, Center for Open Science.
    10. David Chavalarias & Jean-Philippe Cointet, 2013. "Phylomemetic Patterns in Science Evolution—The Rise and Fall of Scientific Fields," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-11, February.
    11. Ryohei Hisano & Didier Sornette & Takayuki Mizuno & Takaaki Ohnishi & Tsutomu Watanabe, 2013. "High Quality Topic Extraction from Business News Explains Abnormal Financial Market Volatility," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-12, June.
    12. Antonin Bergeaud & Yoann Potiron & Juste Raimbault, 2017. "Classifying patents based on their semantic content," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-22, April.
    13. Larsen, Vegard H. & Thorsrud, Leif A., 2019. "The value of news for economic developments," Journal of Econometrics, Elsevier, vol. 210(1), pages 203-218.
    14. Vegard H. Larsen & Leif Anders Thorsrud, 2015. "The Value of News," Working Papers No 6/2015, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    15. Won Sang Lee & Hyo Shin Choi & So Young Sohn, 2018. "Forecasting new product diffusion using both patent citation and web search traffic," PLOS ONE, Public Library of Science, vol. 13(4), pages 1-12, April.
    16. Kilian,Lutz & Lütkepohl,Helmut, 2018. "Structural Vector Autoregressive Analysis," Cambridge Books, Cambridge University Press, number 9781107196575.
    17. Choi, Jinho & Hwang, Yong-Sik, 2014. "Patent keyword network analysis for improving technology development efficiency," Technological Forecasting and Social Change, Elsevier, vol. 83(C), pages 170-182.
    18. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.
    19. Lino Wehrheim, 2017. "Economic History Goes Digital: Topic Modeling the Journal of Economic History," Working Papers 177, Bavarian Graduate Program in Economics (BGPE).
    20. Ryohei Hisano & Didier Sornette & Takayuki Mizuno & Takaaki Ohnishi & Tsutomu Watanabe, 2012. "High quality topic extraction from business news explains abnormal financial market volatility," Papers 1210.6321, arXiv.org, revised Mar 2013.

    Citations

    Cited by:

    1. Savin, Ivan & Ott, Ingrid & Konop, Chris, 2022. "Tracing the evolution of service robotics: Insights from a topic modeling approach," Technological Forecasting and Social Change, Elsevier, vol. 174(C).
    2. Nathan, Max & Rosso, Anna, 2022. "Innovative events: product launches, innovation and firm performance," Research Policy, Elsevier, vol. 51(1).
    3. Max Nathan & Anna Rosso, 2017. "Innovative events," Development Working Papers 429, Centro Studi Luca d'Agliano, University of Milano, revised 08 Apr 2019.
    4. Jeon, Eunji & Yoon, Naeun & Sohn, So Young, 2023. "Exploring new digital therapeutics technologies for psychiatric disorders using BERTopic and PatentSBERTa," Technological Forecasting and Social Change, Elsevier, vol. 186(PA).
    5. Ballester, Omar & Penner, Orion, 2022. "Robustness, replicability and scalability in topic modelling," Journal of Informetrics, Elsevier, vol. 16(1).
    6. Axenbeck, Janna & Breithaupt, Patrick, 2022. "Measuring the digitalisation of firms: A novel text mining approach," ZEW Discussion Papers 22-065, ZEW - Leibniz Centre for European Economic Research.
    7. Winker, Peter, 2023. "Visualizing Topic Uncertainty in Topic Modelling," VfS Annual Conference 2023 (Regensburg): Growth and the "sociale Frage" 277584, Verein für Socialpolitik / German Economic Association.
    8. Janna Axenbeck & Patrick Breithaupt, 2021. "Innovation indicators based on firm websites—Which website characteristics predict firm-level innovation activity?," PLOS ONE, Public Library of Science, vol. 16(4), pages 1-23, April.
    9. Hongshu Chen & Xinna Song & Qianqian Jin & Ximeng Wang, 2022. "Network dynamics in university-industry collaboration: a collaboration-knowledge dual-layer network perspective," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6637-6660, November.
    10. Axenbeck, Janna & Breithaupt, Patrick, 2019. "Web-based innovation indicators: Which firm website characteristics relate to firm-level innovation activity?," ZEW Discussion Papers 19-063, ZEW - Leibniz Centre for European Economic Research.
    11. Viktoriia Naboka-Krell, 2023. "Construction and Analysis of Uncertainty Indices based on Multilingual Text Representations," MAGKS Papers on Economics 202310, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    12. Dhar, Suparna & Tarafdar, Pratik & Bose, Indranil, 2022. "Understanding the evolution of an emerging technological paradigm and its impact: The case of Digital Twin," Technological Forecasting and Social Change, Elsevier, vol. 185(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Leif Anders Thorsrud, 2016. "Nowcasting using news topics Big Data versus big bank," Working Papers No 6/2016, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    2. Levy, Daniel & Mayer, Tamir & Raviv, Alon, 2022. "Economists in the 2008 financial crisis: Slow to see, fast to act," Journal of Financial Stability, Elsevier, vol. 60(C).
    3. Daniel Levy & Tamir Mayer & Alon Raviv, 2020. "Academic Scholarship in Light of the 2008 Financial Crisis: Textual Analysis of NBER Working Papers," Working Papers hal-02488796, HAL.
    4. Larsen, Vegard H. & Thorsrud, Leif Anders & Zhulanova, Julia, 2021. "News-driven inflation expectations and information rigidities," Journal of Monetary Economics, Elsevier, vol. 117(C), pages 507-520.
    5. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2020. "News Media vs. FRED-MD for Macroeconomic Forecasting," CESifo Working Paper Series 8639, CESifo.
    6. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2022. "News media versus FRED‐MD for macroeconomic forecasting," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 37(1), pages 63-81, January.
    7. Lino Wehrheim, 2019. "Economic history goes digital: topic modeling the Journal of Economic History," Cliometrica, Springer;Cliometric Society (Association Francaise de Cliométrie), vol. 13(1), pages 83-125, January.
    8. Juste Raimbault, 2019. "Exploration of an interdisciplinary scientific landscape," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 617-641, May.
    9. Peter Grajzl & Peter Murrell, 2021. "Characterizing a legal–intellectual culture: Bacon, Coke, and seventeenth-century England," Cliometrica, Journal of Historical Economics and Econometric History, Association Française de Cliométrie (AFC), vol. 15(1), pages 43-88, January.
    10. Jochen Lüdering & Peter Tillmann, 2016. "Monetary Policy on Twitter and its Effect on Asset Prices: Evidence from Computational Text Analysis," MAGKS Papers on Economics 201612, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    11. Poza, Carlos & Monge, Manuel, 2020. "A real time leading economic indicator based on text mining for the Spanish economy. Fractional cointegration VAR and Continuous Wavelet Transform analysis," International Economics, Elsevier, vol. 163(C), pages 163-175.
    12. Lino Wehrheim, 2017. "Economic History Goes Digital: Topic Modeling the Journal of Economic History," Working Papers 177, Bavarian Graduate Program in Economics (BGPE).
    13. Mohamed M. Mostafa, 2023. "A one-hundred-year structural topic modeling analysis of the knowledge structure of international management research," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(4), pages 3905-3935, August.
    14. Celso Brunetti & Marc Joëts & Valérie Mignon, 2023. "Reasons Behind Words: OPEC Narratives and the Oil Market," Working Papers 2023-19, CEPII research center.
    15. Leif Anders Thorsrud, 2020. "Words are the New Numbers: A Newsy Coincident Index of the Business Cycle," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 38(2), pages 393-409, April.
    16. Hanjo Odendaal & Monique Reid & Johann F. Kirsten, 2020. "Media‐Based Sentiment Indices as an Alternative Measure of Consumer Confidence," South African Journal of Economics, Economic Society of South Africa, vol. 88(4), pages 409-434, December.
    17. Leonardo N. Ferreira, 2021. "Forecasting with VAR-teXt and DFM-teXt Models:exploring the predictive power of central bank communication," Working Papers Series 559, Central Bank of Brazil, Research Department.
    18. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2022. "Asset returns, news topics, and media effects," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(3), pages 838-868, July.
    19. Saskia Ter Ellen & Vegard H. Larsen & Leif Anders Thorsrud, 2022. "Narrative Monetary Policy Surprises and the Media," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 54(5), pages 1525-1549, August.
    20. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2018. "Business cycle narratives," Working Papers No 6/2018, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.

    More about this item

    Keywords

    Topic Model; R&D; R&I; STI; Innovation; Indicators; Text Mining; Natural Language Processing; NLP;

    JEL classification:

    • O30 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - General
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • C83 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Survey Methods; Sampling Methods
