IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2505.08706.html

Big Data and the Computational Social Science of Entrepreneurship and Innovation

Authors

  • Ningzi Li
  • Shiyang Lai
  • James Evans

Abstract

As large-scale social data explode and machine-learning methods evolve, scholars of entrepreneurship and innovation face new research opportunities but also unique challenges. This chapter discusses the difficulties of leveraging large-scale data to identify technological and commercial novelty, document new venture origins, and forecast competition between new technologies and commercial forms. It suggests how scholars can take advantage of new text, network, image, audio, and video data in two distinct ways that advance innovation and entrepreneurship research. First, machine-learning models, combined with large-scale data, enable the construction of precision measurements that function as system-level observatories of innovation and entrepreneurship across human societies. Second, new artificial intelligence models fueled by big data generate 'digital doubles' of technology and business, forming laboratories for virtual experimentation about innovation and entrepreneurship processes and policies. The chapter argues for the advancement of theory development and testing in entrepreneurship and innovation by coupling big data with big models.

Suggested Citation

  • Ningzi Li & Shiyang Lai & James Evans, 2025. "Big Data and the Computational Social Science of Entrepreneurship and Innovation," Papers 2505.08706, arXiv.org.
  • Handle: RePEc:arx:papers:2505.08706

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2505.08706
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Woodruff, Christopher & Zenteno, Rene, 2007. "Migration networks and microenterprises in Mexico," Journal of Development Economics, Elsevier, vol. 82(2), pages 509-528, March.
    2. Hasan, Sharique & Koning, Rembrand, 2019. "Conversations and idea generation: Evidence from a field experiment," Research Policy, Elsevier, vol. 48(9), pages 1-1.
    3. Prithwiraj Choudhury & Dan Wang & Natalie A. Carlson & Tarun Khanna, 2019. "Machine learning approaches to facial and text analysis: Discovering CEO oral communication styles," Strategic Management Journal, Wiley Blackwell, vol. 40(11), pages 1705-1732, November.
    4. Dennis W. Campbell & Ruidi Shang, 2022. "Tone at the Bottom: Measuring Corporate Misconduct Risk from the Text of Employee Reviews," Management Science, INFORMS, vol. 68(9), pages 7034-7053, September.
    5. Jorge Guzman & Aishen Li, 2023. "Measuring Founding Strategy," Management Science, INFORMS, vol. 69(1), pages 101-118, January.
    6. Aghion, Philippe & Howitt, Peter, 1992. "A Model of Growth through Creative Destruction," Econometrica, Econometric Society, vol. 60(2), pages 323-351, March.
    7. Wang, Jian & Veugelers, Reinhilde & Stephan, Paula, 2017. "Bias against novelty in science: A cautionary tale for users of bibliometric indicators," Research Policy, Elsevier, vol. 46(8), pages 1416-1436.
    8. Vahe Tshitoyan & John Dagdelen & Leigh Weston & Alexander Dunn & Ziqin Rong & Olga Kononova & Kristin A. Persson & Gerbrand Ceder & Anubhav Jain, 2019. "Unsupervised word embeddings capture latent knowledge from materials science literature," Nature, Nature, vol. 571(7763), pages 95-98, July.
    9. Michael Park & Erin Leahey & Russell J. Funk, 2023. "Papers and patents are becoming less disruptive over time," Nature, Nature, vol. 613(7942), pages 138-144, January.
    10. Johannes M. Pennings & Farid Harianto, 1992. "Technological Networking and Innovation Implementation," Organization Science, INFORMS, vol. 3(3), pages 356-382, August.
    11. Gustaf Bellstam & Sanjai Bhagat & J. Anthony Cookson, 2021. "A Text-Based Analysis of Corporate Innovation," Management Science, INFORMS, vol. 67(7), pages 4004-4031, July.
    12. Shinichi Kamiya & Y. Han (Andy) Kim & Soohyun Park, 2019. "The face of risk: CEO facial masculinity and firm risk," European Financial Management, European Financial Management Association, vol. 25(2), pages 239-270, March.
    13. Jochen Christian Werth & Patrick Boeert, 2013. "Co-investment networks of business angels and the performance of their start-up investments," International Journal of Entrepreneurial Venturing, Inderscience Enterprises Ltd, vol. 5(3), pages 240-256.
    14. Peter Thompson & Melanie Fox-Kean, 2005. "Patent Citations and the Geography of Knowledge Spillovers: A Reassessment: Reply," American Economic Review, American Economic Association, vol. 95(1), pages 465-466, March.
    15. Shaobo Li & Jie Hu & Yuxin Cui & Jianjun Hu, 2018. "DeepPatent: patent classification with convolutional neural networks and word embedding," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(2), pages 721-744, November.
    16. Jamshid Sourati & James A. Evans, 2023. "Accelerating science with human-aware artificial intelligence," Nature Human Behaviour, Nature, vol. 7(10), pages 1682-1696, October.
    17. Karl Taeuscher & Eric Yanfei Zhao & Michael Lounsbury, 2022. "Categories and narratives as sources of distinctiveness: Cultural entrepreneurship within and across categories," Strategic Management Journal, Wiley Blackwell, vol. 43(10), pages 2101-2134, October.
    18. Lindell Bromham & Russell Dinnage & Xia Hua, 2016. "Interdisciplinary research has consistently lower funding success," Nature, Nature, vol. 534(7609), pages 684-687, June.
    19. Juan Bu & Eric Yanfei Zhao & Krista J. Li & Joanna Mingxuan Li, 2022. "Multilevel optimal distinctiveness: Examining the impact of within‐ and between‐organization distinctiveness of product design on market performance," Strategic Management Journal, Wiley Blackwell, vol. 43(9), pages 1793-1822, September.
    20. Peter Thompson & Melanie Fox-Kean, 2005. "Patent Citations and the Geography of Knowledge Spillovers: A Reassessment," American Economic Review, American Economic Association, vol. 95(1), pages 450-460, March.
    21. Fleming, Lee & Sorenson, Olav, 2001. "Technology as a complex adaptive system: evidence from patent data," Research Policy, Elsevier, vol. 30(7), pages 1019-1039, August.
    22. Feng Shi & James Evans, 2023. "Surprising combinations of research contents and contexts are related to impact and emerge with scientific outsiders from distant disciplines," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    23. Yanto Chandra & Li Crystal Jiang & Cheng-Jun Wang, 2016. "Mining Social Entrepreneurship Strategies Using Topic Modeling," PLOS ONE, Public Library of Science, vol. 11(3), pages 1-28, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hain, Daniel S. & Jurowetzki, Roman & Buchmann, Tobias & Wolf, Patrick, 2022. "A text-embedding-based approach to measuring patent-to-patent technological similarity," Technological Forecasting and Social Change, Elsevier, vol. 177(C).
    2. Marta Aloi & Joanna Poyago-Theotoky & Frédéric Tournemaine, 2022. "The Geography of Knowledge and R&D-led Growth [Real effects of academic research: comment]," Journal of Economic Geography, Oxford University Press, vol. 22(6), pages 1149-1190.
    3. Carlino, Gerald & Kerr, William R., 2015. "Agglomeration and Innovation," Handbook of Regional and Urban Economics, in: Gilles Duranton & J. V. Henderson & William C. Strange (ed.), Handbook of Regional and Urban Economics, edition 1, volume 5, chapter 0, pages 349-404, Elsevier.
    4. Mori, Tomoya & Sakaguchi, Shosei, 2018. "Collaborative knowledge creation: Evidence from Japanese patent data," MPRA Paper 88716, University Library of Munich, Germany.
    5. Diemer, Andreas & Regan, Tanner, 2022. "No inventor is an island: Social connectedness and the geography of knowledge flows in the US," Research Policy, Elsevier, vol. 51(2).
    6. Duranton, Gilles & Puga, Diego, 2014. "The Growth of Cities," Handbook of Economic Growth, in: Philippe Aghion & Steven Durlauf (ed.), Handbook of Economic Growth, edition 1, volume 2, chapter 5, pages 781-853, Elsevier.
    7. Zoltán J. Ács & Pontus Braunerhjelm & David B. Audretsch & Bo Carlsson, 2015. "The knowledge spillover theory of entrepreneurship," Chapters, in: Global Entrepreneurship, Institutions and Incentives, chapter 7, pages 129-144, Edward Elgar Publishing.
    8. Sam Arts & Nicola Melluso & Reinhilde Veugelers, 2023. "Beyond Citations: Measuring Novel Scientific Ideas and their Impact in Publication Text," Papers 2309.16437, arXiv.org, revised Dec 2024.
    9. Castillo, Victoria & Figal-Garone, Lucas & Maffioli, Alessandro & Rojo, Sofia & Stucchi, Rodolfo, 2016. "The Effects of Knowledge Spillovers through Labor Mobility," MPRA Paper 69141, University Library of Munich, Germany.
    10. Nathan Goldschlag & Elisabeth Perlman, 2017. "Business Dynamic Statistics of Innovative Firms," Working Papers 17-72, Center for Economic Studies, U.S. Census Bureau.
    11. Li, Meiling & Wang, Yang & Du, Haifeng & Bai, Aruhan, 2024. "Motivating innovation: The impact of prestigious talent funding on junior scientists," Research Policy, Elsevier, vol. 53(9).
    12. David H. Hsu & Kwanghui Lim, 2014. "Knowledge Brokering and Organizational Innovation: Founder Imprinting Effects," Organization Science, INFORMS, vol. 25(4), pages 1134-1153, August.
    13. Baaden, Philipp & Rennings, Michael & John, Marcus & Bröring, Stefanie, 2024. "On the emergence of interdisciplinary scientific fields: (how) does it relate to science convergence?," Research Policy, Elsevier, vol. 53(6).
    14. Ufuk Akcigit & William R. Kerr, 2018. "Growth through Heterogeneous Innovations," Journal of Political Economy, University of Chicago Press, vol. 126(4), pages 1374-1443.
    15. Tomoya Mori & Shosei Sakaguchi, 2019. "Creation of knowledge through exchanges of knowledge: Evidence from Japanese patent data," Papers 1908.01256, arXiv.org, revised Aug 2020.
    16. Pierre Azoulay & Joshua S. Graff Zivin & Bhaven N. Sampat, 2011. "The Diffusion of Scientific Knowledge across Time and Space: Evidence from Professional Transitions for the Superstars of Medicine," NBER Chapters, in: The Rate and Direction of Inventive Activity Revisited, pages 107-155, National Bureau of Economic Research, Inc.
    17. Barbieri, Nicolò & Marzucchi, Alberto & Rizzo, Ugo, 2020. "Knowledge sources and impacts on subsequent inventions: Do green technologies differ from non-green ones?," Research Policy, Elsevier, vol. 49(2).
    18. Dechezlepretre, Antoine & Martin, Ralf & Mohnen, Myra, 2014. "Knowledge spillovers from clean and dirty technologies," LSE Research Online Documents on Economics 60501, London School of Economics and Political Science, LSE Library.
    19. Gao, Wenlian & Chou, Julia, 2015. "Innovation efficiency, global diversification, and firm value," Journal of Corporate Finance, Elsevier, vol. 30(C), pages 278-298.
    20. Adam Whittle, 2017. "Local and Non-Local Knowledge Typologies: Technological Complexity in the Irish Knowledge Space," Papers in Evolutionary Economic Geography (PEEG) 1728, Utrecht University, Department of Human Geography and Spatial Planning, Group Economic Geography, revised Nov 2017.
