Printed from https://ideas.repec.org/a/plo/pone00/0249071.html

Predicting innovative firms using web mining and deep learning

Author

Listed:
  • Jan Kinne
  • David Lenz

Abstract

Evidence-based STI (science, technology, and innovation) policy making requires accurate indicators of innovation in order to promote economic growth. However, traditional indicators from patents and questionnaire-based surveys often lack coverage, granularity, and timeliness, and may involve high data collection costs, especially when conducted at a large scale. Consequently, they struggle to provide policy makers and scientists with the full picture of the current state of the innovation system. In this paper, we propose a first approach to generating web-based innovation indicators, which may have the potential to overcome some of the shortcomings of traditional indicators. Specifically, we develop a method to identify product innovator firms at a large scale and at very low cost. We use traditional firm-level indicators from a questionnaire-based innovation survey (German Community Innovation Survey) to train an artificial neural network classification model on labelled (product innovator/no product innovator) web texts of surveyed firms. Subsequently, we apply this classification model to the web texts of hundreds of thousands of firms in Germany to predict whether they are product innovators or not. We then compare these predictions to firm-level patent statistics, survey extrapolation benchmark data, and regional innovation indicators. The results show that our approach produces reliable predictions and has the potential to be a valuable and highly cost-efficient addition to the existing set of innovation indicators, especially due to its coverage and regional granularity.
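
The train-then-predict workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract names an artificial neural network trained on survey-labelled web texts, whereas the sketch below swaps in a single-layer logistic classifier over bag-of-words counts so that it runs with the Python standard library alone. The toy texts, the labels, and every function name (`tokenize`, `build_vocab`, `vectorize`, `train`, `is_product_innovator`) are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: (website text, label), 1 = product innovator, 0 = not.
# These strings are invented for illustration; they are not survey data.
LABELLED = [
    ("we launched a new product line with patented sensor technology", 1),
    ("our research team developed a novel software platform", 1),
    ("new prototype released after years of development", 1),
    ("family bakery offering fresh bread and cakes daily", 0),
    ("local plumbing repairs and maintenance at fair prices", 0),
    ("traditional restaurant with regional dishes and daily menu", 0),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_vocab(texts):
    """Map every distinct training word to a feature index."""
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i for i, w in enumerate(words)}


def vectorize(text, vocab):
    """Turn a text into bag-of-words counts; unknown words are ignored."""
    vec = [0.0] * len(vocab)
    for word, count in Counter(tokenize(text)).items():
        if word in vocab:
            vec[vocab[word]] = float(count)
    return vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(vec, weights, bias):
    """Estimated probability that the text belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, vec)) + bias)


def train(samples, vocab, epochs=200, lr=0.1):
    """Fit weights by stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    data = [(vectorize(text, vocab), label) for text, label in samples]
    for _ in range(epochs):
        for vec, label in data:
            err = predict_proba(vec, weights, bias) - label
            for i, x in enumerate(vec):
                if x:
                    weights[i] -= lr * err * x
            bias -= lr * err
    return weights, bias


# Step 1: train on the labelled (surveyed) firms.
VOCAB = build_vocab([text for text, _ in LABELLED])
WEIGHTS, BIAS = train(LABELLED, VOCAB)


def is_product_innovator(text, threshold=0.5):
    """Step 2: apply the trained model to the text of an unlabelled firm."""
    return predict_proba(vectorize(text, VOCAB), WEIGHTS, BIAS) >= threshold
```

Calling `is_product_innovator` on the text of a firm outside the labelled set mirrors the second stage in the abstract, where the trained model is applied to firms that were never surveyed. The 0.5 threshold is an assumption of this sketch, not a value taken from the paper.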

Suggested Citation

  • Jan Kinne & David Lenz, 2021. "Predicting innovative firms using web mining and deep learning," PLOS ONE, Public Library of Science, vol. 16(4), pages 1-18, April.
  • Handle: RePEc:plo:pone00:0249071
    DOI: 10.1371/journal.pone.0249071

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0249071
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0249071&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0249071?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Matthew Gentzkow & Bryan T. Kelly & Matt Taddy, 2017. "Text as Data," NBER Working Papers 23276, National Bureau of Economic Research, Inc.
    2. Vegard H. Larsen & Leif Anders Thorsrud, 2015. "The Value of News," Working Papers No 6/2015, Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
    3. Nagaoka, Sadao & Motohashi, Kazuyuki & Goto, Akira, 2010. "Patent Statistics as an Innovation Indicator," Handbook of the Economics of Innovation, in: Bronwyn H. Hall & Nathan Rosenberg (ed.), Handbook of the Economics of Innovation, edition 1, volume 2, chapter 0, pages 1083-1127, Elsevier.
    4. Bettina Peters, 2009. "Persistence of innovation: stylised facts and panel data evidence," The Journal of Technology Transfer, Springer, vol. 34(2), pages 226-243, April.
    5. repec:fth:harver:1473 is not listed on IDEAS
    6. Zvi Griliches, 1998. "Patent Statistics as Economic Indicators: A Survey," NBER Chapters, in: R&D and Productivity: The Econometric Evidence, pages 287-343, National Bureau of Economic Research, Inc.
    7. Bronwyn H. Hall & Nathan Rosenberg (ed.), 2010. "Handbook of the Economics of Innovation," Handbook of the Economics of Innovation, Elsevier, edition 1, volume 1, number 1.
    8. Lüdering Jochen & Winker Peter, 2016. "Forward or Backward Looking? The Economic Discourse and the Observed Reality," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(4), pages 483-515, August.
    9. Zoltan J. Acs & Luc Anselin & Attila Varga, 2008. "Patents and Innovation Counts as Measures of Regional Production of New Knowledge," Chapters, in: Entrepreneurship, Growth and Public Policy, chapter 11, pages 135-151, Edward Elgar Publishing.
    10. Fred Gault (ed.), 2013. "Handbook of Innovation Indicators and Measurement," Books, Edward Elgar Publishing, number 14427, June.
    11. Kinne, Jan & Axenbeck, Janna, 2018. "Web mining of firm websites: A framework for web scraping and a pilot study for Germany," ZEW Discussion Papers 18-033, ZEW - Leibniz Centre for European Economic Research.
    12. Bersch, Johannes & Gottschalk, Sandra & Müller, Bettina & Niefert, Michaela, 2014. "The Mannheim Enterprise Panel (MUP) and firm statistics for Germany," ZEW Discussion Papers 14-104, ZEW - Leibniz Centre for European Economic Research.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jahn, Vera & Berlemann, Michael, 2014. "Governance, Firm Size and Innovative Capacity: Regional Empirical Evidence for Germany," VfS Annual Conference 2014 (Hamburg): Evidence-based Economic Policy 100412, Verein für Socialpolitik / German Economic Association.
    2. Behrens, Vanessa & Berger, Marius & Hud, Martin & Hünermund, Paul & Iferd, Younes & Peters, Bettina & Rammer, Christian & Schubert, Torben, 2017. "Innovation activities of firms in Germany - Results of the German CIS 2012 and 2014: Background report on the surveys of the Mannheim Innovation Panel Conducted in the Years 2013 to 2016," ZEW Dokumentationen 17-04, ZEW - Leibniz Centre for European Economic Research.
    3. Muhammad Athar Nadeem & Zhiying Liu & Haji Suleman Ali & Amna Younis & Muhammad Bilal & Yi Xu, 2020. "Innovation and Sustainable Development: Does Aid and Political Instability Impede Innovation?," SAGE Open, vol. 10(4), pages 21582440209, November.
    4. Mohnen, Pierre, 2019. "R&D, innovation and productivity," MERIT Working Papers 2019-016, United Nations University - Maastricht Economic and Social Research Institute on Innovation and Technology (MERIT).
    5. Carlino, Gerald & Kerr, William R., 2015. "Agglomeration and Innovation," Handbook of Regional and Urban Economics, in: Gilles Duranton & J. V. Henderson & William C. Strange (ed.), Handbook of Regional and Urban Economics, edition 1, volume 5, chapter 0, pages 349-404, Elsevier.
    6. Jörn Block & Christian Fisch & Kenta Ikeuchi & Masatoshi Kato, 2022. "Trademarks as an indicator of regional innovation: evidence from Japanese prefectures," Regional Studies, Taylor & Francis Journals, vol. 56(2), pages 190-209, February.
    7. Adelheid Holl & Bettina Peters & Christian Rammer, 2023. "Local knowledge spillovers and innovation persistence of firms," Economics of Innovation and New Technology, Taylor & Francis Journals, vol. 32(6), pages 826-850, August.
    8. Martin Kalthaus, 2020. "Knowledge recombination along the technology life cycle," Journal of Evolutionary Economics, Springer, vol. 30(3), pages 643-704, July.
    9. Riccardo Crescenzi & Alexander Jaax, 2017. "Innovation in Russia: The Territorial Dimension," Economic Geography, Taylor & Francis Journals, vol. 93(1), pages 66-88, January.
    10. Marina Flamand, 2016. "Studying strategic choices of carmakers in the development of energy storage solutions: a patent analysis," International Journal of Automotive Technology and Management, Inderscience Enterprises Ltd, vol. 16(2), pages 169-192.
    11. Kang, Byeongwoo, 2014. "The innovation process of a privately-owned enterprise and a state-owned enterprise in China," IDE Discussion Papers 470, Institute of Developing Economies, Japan External Trade Organization(JETRO).
    12. Lino Wehrheim, 2019. "Economic history goes digital: topic modeling the Journal of Economic History," Cliometrica, Springer;Cliometric Society (Association Francaise de Cliométrie), vol. 13(1), pages 83-125, January.
    13. Lei Jin & Keran Duan & Xu Tang, 2018. "What Is the Relationship between Technological Innovation and Energy Consumption? Empirical Analysis Based on Provincial Panel Data from China," Sustainability, MDPI, vol. 10(1), pages 1-13, January.
    14. Pfister, Curdin & Koomen, Miriam & Harhoff, Dietmar & Backes-Gellner, Uschi, 2021. "Regional innovation effects of applied research institutions," Research Policy, Elsevier, vol. 50(4).
    15. repec:bof:bofrdp:urn:nbn:fi:bof-201512111472 is not listed on IDEAS
    16. Stucki, Tobias & Woerter, Martin, 2019. "The private returns to knowledge: A comparison of ICT, biotechnologies, nanotechnologies, and green technologies," Technological Forecasting and Social Change, Elsevier, vol. 145(C), pages 62-81.
    17. Alessandra Colombelli & Francesco Quatraro, 2014. "The persistence of firms' knowledge base: a quantile approach to Italian data," Economics of Innovation and New Technology, Taylor & Francis Journals, vol. 23(7), pages 585-610, October.
    18. Yin, Hua-Tang & Wen, Jun & Chang, Chun-Ping, 2022. "Science-technology intermediary and innovation in China: Evidence from State Administration for Market Regulation, 2000–2019," Technology in Society, Elsevier, vol. 68(C).
    19. Fritsch, Michael & Wyrwich, Michael, 2021. "Is innovation (increasingly) concentrated in large cities? An international comparison," Research Policy, Elsevier, vol. 50(6).
    20. Sam Tavassoli & Nunzia Carbonara, 2014. "The role of knowledge variety and intensity for regional innovation," Small Business Economics, Springer, vol. 43(2), pages 493-509, August.
    21. Svensson, Roger, 2015. "Measuring Innovation Using Patent Data," Working Paper Series 1067, Research Institute of Industrial Economics.
