Printed from https://ideas.repec.org/a/plo/pone00/0226685.html

Measuring the diffusion of innovations with paragraph vector topic models

Authors
  • David Lenz
  • Peter Winker

Abstract

Measuring the diffusion of innovations from textual data sources besides patent data has not been studied extensively. However, early and accurate indicators of innovation and the recognition of trends in innovation are mandatory to successfully promote economic growth through technological progress via evidence-based policy making. In this study, we propose Paragraph Vector Topic Model (PVTM) and apply it to technology-related news articles to analyze innovation-related topics over time and gain insights regarding their diffusion process. PVTM represents documents in a semantic space, which has been shown to capture latent variables of the underlying documents, e.g., the latent topics. Clusters of documents in the semantic space can then be interpreted and transformed into meaningful topics by means of Gaussian mixture modeling. In using PVTM, we identify innovation-related topics from 170,000 technology news articles published over a span of 20 years and gather insights about their diffusion state by measuring the topic importance in the corpus over time. Our results suggest that PVTM is a credible alternative to widely used topic models for the discovery of latent topics in (technology-related) news articles. An examination of three exemplary topics shows that innovation diffusion could be assessed using topic importance measures derived from PVTM. Thereby, we find that PVTM diffusion indicators for certain topics are Granger causal to Google Trend indices with matching search terms.
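
The abstract does not say how topic importance over time is computed, so the following is only a minimal sketch under stated assumptions. It assumes each document already has topic membership probabilities (for example, from a Gaussian mixture fitted to document vectors) and a publication year, and it defines a topic's importance in a year as the mean membership probability over that year's documents. The function name `topic_importance_by_period` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def topic_importance_by_period(doc_periods, doc_topic_probs):
    """Average per-document topic probabilities within each period.

    doc_periods: one period label (e.g. publication year) per document.
    doc_topic_probs: one list of topic membership probabilities per document.
    Returns {period: [mean probability of topic 0, topic 1, ...]}.

    Assumption: "importance" is the mean membership probability; the paper
    may define its indicator differently.
    """
    if len(doc_periods) != len(doc_topic_probs):
        raise ValueError("one period label is required per document")

    sums = {}                  # period -> running sum of probabilities per topic
    counts = defaultdict(int)  # period -> number of documents seen
    for period, probs in zip(doc_periods, doc_topic_probs):
        if period not in sums:
            sums[period] = [0.0] * len(probs)
        for k, p in enumerate(probs):
            sums[period][k] += p
        counts[period] += 1

    return {
        period: [total / counts[period] for total in totals]
        for period, totals in sums.items()
    }


# Toy data (hypothetical): four documents, two topics, two years.
years = [2000, 2000, 2001, 2001]
probs = [[0.75, 0.25], [0.25, 0.75], [0.0, 1.0], [0.5, 0.5]]
importance = topic_importance_by_period(years, probs)
print(importance)
```

In this toy example the second topic's share rises from 0.5 in 2000 to 0.75 in 2001, which is the kind of time series that could then be compared against an external indicator such as a search-interest index.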

Suggested Citation

  • David Lenz & Peter Winker, 2020. "Measuring the diffusion of innovations with paragraph vector topic models," PLOS ONE, Public Library of Science, vol. 15(1), pages 1-18, January.
  • Handle: RePEc:plo:pone00:0226685
    DOI: 10.1371/journal.pone.0226685

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0226685
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0226685&type=printable
    Download Restriction: no

    References listed on IDEAS

    1. Lüdering Jochen & Winker Peter, 2016. "Forward or Backward Looking? The Economic Discourse and the Observed Reality," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(4), pages 483-515, August.
    2. Bryan Kelly & Dimitris Papanikolaou & Amit Seru & Matt Taddy, 2018. "Measuring Technological Innovation over the Long Run," NBER Working Papers 25266, National Bureau of Economic Research, Inc.
    3. Stephen Hansen & Michael McMahon & Andrea Prat, 2018. "Transparency and Deliberation Within the FOMC: A Computational Linguistics Approach," The Quarterly Journal of Economics, Oxford University Press, vol. 133(2), pages 801-870.
    4. Leah G. Nichols, 2014. "A topic model approach to measuring interdisciplinarity at the National Science Foundation," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(3), pages 741-754, September.
    5. Hyunyoung Choi & Hal Varian, 2012. "Predicting the Present with Google Trends," The Economic Record, The Economic Society of Australia, vol. 88(s1), pages 2-9, June.
    6. Stathoulopoulos, Kostas & Mateos-Garcia, Juan, 2017. "Mapping without a map: Exploring the UK business landscape using unsupervised learning," SocArXiv ryxdk, Center for Open Science.
    7. David Chavalarias & Jean-Philippe Cointet, 2013. "Phylomemetic Patterns in Science Evolution—The Rise and Fall of Scientific Fields," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-11, February.
    8. Ryohei Hisano & Didier Sornette & Takayuki Mizuno & Takaaki Ohnishi & Tsutomu Watanabe, 2013. "High Quality Topic Extraction from Business News Explains Abnormal Financial Market Volatility," PLOS ONE, Public Library of Science, vol. 8(6), pages 1-12, June.
    9. Antonin Bergeaud & Yoann Potiron & Juste Raimbault, 2017. "Classifying patents based on their semantic content," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-22, April.
    10. Larsen, Vegard H. & Thorsrud, Leif A., 2019. "The value of news for economic developments," Journal of Econometrics, Elsevier, vol. 210(1), pages 203-218.
    11. Vegard H. Larsen & Leif Anders Thorsrud, 2015. "The Value of News," Working Papers No 6/2015, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    12. Kilian,Lutz & Lütkepohl,Helmut, 2018. "Structural Vector Autoregressive Analysis," Cambridge Books, Cambridge University Press, number 9781107196575, December.
    13. Ryohei Hisano & Didier Sornette & Takayuki Mizuno & Takaaki Ohnishi & Tsutomu Watanabe, 2012. "High quality topic extraction from business news explains abnormal financial market volatility," Papers 1210.6321, arXiv.org, revised Mar 2013.

    Citations

    Cited by:

    1. Janna Axenbeck & Patrick Breithaupt, 2021. "Innovation indicators based on firm websites—Which website characteristics predict firm-level innovation activity?," PLOS ONE, Public Library of Science, vol. 16(4), pages 1-23, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Larsen, Vegard H. & Thorsrud, Leif Anders & Zhulanova, Julia, 2021. "News-driven inflation expectations and information rigidities," Journal of Monetary Economics, Elsevier, vol. 117(C), pages 507-520.
    2. Vegard Høghaug Larsen & Leif Anders Thorsrud, 2018. "Business cycle narratives," Working Papers No 6/2018, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    3. Juste Raimbault, 2019. "Exploration of an interdisciplinary scientific landscape," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 617-641, May.
    4. Jochen Lüdering & Peter Tillmann, 2016. "Monetary Policy on Twitter and its Effect on Asset Prices: Evidence from Computational Text Analysis," MAGKS Papers on Economics 201612, Philipps-Universität Marburg, Faculty of Business Administration and Economics, Department of Economics (Volkswirtschaftliche Abteilung).
    5. Poza, Carlos & Monge, Manuel, 2020. "A real time leading economic indicator based on text mining for the Spanish economy. Fractional cointegration VAR and Continuous Wavelet Transform analysis," International Economics, Elsevier, vol. 163(C), pages 163-175.
    6. Lino Wehrheim, 2017. "Economic History Goes Digital: Topic Modeling the Journal of Economic History," Working Papers 177, Bavarian Graduate Program in Economics (BGPE).
    7. Shinya Kawata & Yoshi Fujiwara, 2016. "Constructing of network from topics and their temporal change in the Nikkei newspaper articles," Evolutionary and Institutional Economics Review, Springer, vol. 13(2), pages 423-436, December.
    8. Leif Anders Thorsrud, 2016. "Nowcasting using news topics Big Data versus big bank," Working Papers No 6/2016, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    9. Leif Anders Thorsrud, 2020. "Words are the New Numbers: A Newsy Coincident Index of the Business Cycle," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 38(2), pages 393-409, April.
    10. Vegard H. Larsen & Leif Anders Thorsrud, 2017. "Asset returns, news topics, and media effects," Working Paper 2017/17, Norges Bank.
    11. Lüdering Jochen & Winker Peter, 2016. "Forward or Backward Looking? The Economic Discourse and the Observed Reality," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(4), pages 483-515, August.
    12. Melody Y. Huang & Randall R. Rojas & Patrick D. Convery, 0. "Forecasting stock market movements using Google Trend searches," Empirical Economics, Springer, vol. 0, pages 1-19.
    13. Saskia ter Ellen & Vegard H. Larsen & Leif Anders Thorsrud, 2019. "Narrative monetary policy surprises and the media," Working Papers No 06/2019, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    14. Nancy Kong & Uwe Dulleck & Adam B. Jaffe & Shupeng Sun & Sowmya Vajjala, 2020. "Linguistic Metrics for Patent Disclosure: Evidence from University Versus Corporate Patents," NBER Working Papers 27803, National Bureau of Economic Research, Inc.
    15. Aghion, Philippe & Bergeaud, Antonin & Van Reenen, John, 2021. "The Impact of Regulation on Innovation," IZA Discussion Papers 14082, Institute of Labor Economics (IZA).
    16. Felix Kapfhammer & Vegard H. Larsen & Leif Anders Thorsrud, 2020. "Climate Risk and Commodity Currencies," Working Papers No 10/2020, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    17. Thomas J Hwang, 2013. "Stock Market Returns and Clinical Trial Results of Investigational Compounds: An Event Study Analysis of Large Biopharmaceutical Companies," PLOS ONE, Public Library of Science, vol. 8(8), pages 1-8, August.
    18. Yoshifumi Tahira & Takayuki Mizuno, 2016. "Trading strategy of a stock index based on the frequency of news releases for listed companies," Evolutionary and Institutional Economics Review, Springer, vol. 13(2), pages 437-444, December.
    19. Larsen, Vegard H. & Thorsrud, Leif A., 2019. "The value of news for economic developments," Journal of Econometrics, Elsevier, vol. 210(1), pages 203-218.
    20. Matthew Gentzkow & Bryan T. Kelly & Matt Taddy, 2017. "Text as Data," NBER Working Papers 23276, National Bureau of Economic Research, Inc.

    More about this item

    JEL classification:

    • O30 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - General
    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • C83 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Survey Methods; Sampling Methods
