Printed from https://ideas.repec.org/a/spr/scient/v129y2024i12d10.1007_s11192-024-05196-9.html

Predicting scholar potential: a deep learning model on social capital features

Author

Listed:
  • Dehu Yin

    (Tianjin University)

  • Xi Zhang

    (Tianjin University)

  • Hongke Zhao

    (Tianjin University)

  • Li Tang

    (Fudan University)

Abstract

Identifying scholars with potential early in their careers is critical for informed evaluations, effective allocation of funding, and tenure decisions, which in turn propel advancements in science and technology. This paper investigates the impact of social capital features on the identification of such scholars. Using a dataset spanning 1991 to 2020, extracted from the Microsoft Academic Knowledge Graph, we measure the novelty of 56,568 scholars' future publications with the disruption index. We label as high-potential those scholars who fall within the top 1% on these values. Our approach extracts nine key features of structural, relational, and cognitive capital from the dynamic co-authorship networks of these scholars during their early career stages. The influence of these features on scholar identification is assessed through ablation experiments using an LSTM-based predictive model. Our findings underscore the critical importance of cognitive capital features in the identification process. Furthermore, integrating structural and relational capital features markedly enhances the model's predictive accuracy, achieving significant improvements in precision metrics. Notably, relational capital features have a greater influence than structural features in predicting scholar potential. These results provide insights and practical implications for strategies aimed at recognizing and fostering outstanding academic talent.
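
The abstract does not say which variant of the disruption index was used or how scores were aggregated per scholar. As a rough illustration only, the sketch below computes the commonly cited form D = (n_i - n_j) / (n_i + n_j + n_k), where n_i counts later papers citing the focal paper but none of its references, n_j counts those citing both, and n_k counts those citing only the references. It then selects a top fraction of scores. The function names, the toy citation data, and the tie handling are assumptions made for this example, not details taken from the paper.

```python
from typing import Dict, Set


def disruption_index(focal: str,
                     references: Set[str],
                     citations: Dict[str, Set[str]]) -> float:
    """Disruption index of `focal` (hypothetical helper).

    `citations` maps each later paper to the set of papers it cites.
    """
    n_i = n_j = n_k = 0
    for citing, cited in citations.items():
        if citing == focal:
            continue
        cites_focal = focal in cited
        cites_refs = bool(cited & references)
        if cites_focal and not cites_refs:
            n_i += 1  # cites the focal paper only
        elif cites_focal and cites_refs:
            n_j += 1  # cites the focal paper and its references
        elif cites_refs:
            n_k += 1  # cites the references only
    total = n_i + n_j + n_k
    return (n_i - n_j) / total if total else 0.0


def top_fraction(scores: Dict[str, float], fraction: float = 0.01) -> Set[str]:
    """Return the ids with the highest scores (at least one id is returned)."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=lambda key: scores[key], reverse=True)
    return set(ranked[:k])


# Toy data: focal paper "F" cites "R1" and "R2"; four later papers cite around it.
toy_citations = {
    "A": {"F"},
    "B": {"F"},
    "C": {"F", "R1"},
    "D": {"R2"},
}
d_value = disruption_index("F", {"R1", "R2"}, toy_citations)
labelled = top_fraction({"s1": 0.5, "s2": 0.1, "s3": -0.2})
```

In the toy data, two papers cite only the focal paper, one cites both, and one cites only a reference, so the index is (2 - 1) / 4. A real pipeline would also need a rule for combining paper-level scores into one value per scholar, which the abstract leaves unspecified.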

Suggested Citation

  • Dehu Yin & Xi Zhang & Hongke Zhao & Li Tang, 2024. "Predicting scholar potential: a deep learning model on social capital features," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(12), pages 7851-7879, December.
  • Handle: RePEc:spr:scient:v:129:y:2024:i:12:d:10.1007_s11192-024-05196-9
    DOI: 10.1007/s11192-024-05196-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-024-05196-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. Zhao, Qihang & Feng, Xiaodong, 2022. "Utilizing citation network structure to predict paper citation counts: A Deep learning approach," Journal of Informetrics, Elsevier, vol. 16(1).
    2. Bordons, María & Aparicio, Javier & González-Albo, Borja & Díaz-Faes, Adrián A., 2015. "The relationship between the research performance of scientists and their position in co-authorship networks in three fields," Journal of Informetrics, Elsevier, vol. 9(1), pages 135-144.
    3. Bryan Kelly & Dimitris Papanikolaou & Amit Seru & Matt Taddy, 2021. "Measuring Technological Innovation over the Long Run," American Economic Review: Insights, American Economic Association, vol. 3(3), pages 303-320, September.
    4. Stegehuis, Clara & Litvak, Nelly & Waltman, Ludo, 2015. "Predicting the long-term citation impact of recent publications," Journal of Informetrics, Elsevier, vol. 9(3), pages 642-657.
    5. Xuli Tang & Xin Li & Feicheng Ma, 2022. "Internationalizing AI: evolution and impact of distance factors," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(1), pages 181-205, January.
    6. Ruan, Xuanmin & Zhu, Yuanyang & Li, Jiang & Cheng, Ying, 2020. "Predicting the citation counts of individual papers via a BP neural network," Journal of Informetrics, Elsevier, vol. 14(3).
    7. Trapido, Denis, 2015. "How novelty in knowledge earns recognition: The role of consistent identities," Research Policy, Elsevier, vol. 44(8), pages 1488-1500.
    8. Tobias Mistele & Tom Price & Sabine Hossenfelder, 2019. "Predicting authors’ citation counts and h-indices with a neural network," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(1), pages 87-104, July.
    9. Uddin, Shahadat & Khan, Arif, 2016. "The impact of author-selected keywords on citation counts," Journal of Informetrics, Elsevier, vol. 10(4), pages 1166-1177.
    10. Bornmann, Lutz & Williams, Richard, 2017. "Can the journal impact factor be used as a criterion for the selection of junior researchers? A large-scale empirical study based on ResearcherID data," Journal of Informetrics, Elsevier, vol. 11(3), pages 788-799.
    11. Wang, Jian & Veugelers, Reinhilde & Stephan, Paula, 2017. "Bias against novelty in science: A cautionary tale for users of bibliometric indicators," Research Policy, Elsevier, vol. 46(8), pages 1416-1436.
    12. Yuhao Zhou & Ruijie Wang & An Zeng, 2022. "Predicting the impact and publication date of individual scientists’ future papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 1867-1882, April.
    13. Russell J. Funk & Jason Owen-Smith, 2017. "A Dynamic Network Measure of Technological Change," Management Science, INFORMS, vol. 63(3), pages 791-817, March.
    14. Abramo, Giovanni & D’Angelo, Ciriaco Andrea & Felici, Giovanni, 2019. "Predicting publication long-term impact through a combination of early citations and journal impact factor," Journal of Informetrics, Elsevier, vol. 13(1), pages 32-49.
    15. Matthew E Falagas & Angeliki Zarkali & Drosos E Karageorgopoulos & Vangelis Bardakas & Michael N Mavros, 2013. "The Impact of Article Length on the Number of Future Citations: A Bibliometric Analysis of General Medicine Journals," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-8, February.
    16. Seyyed Reza Taher Harikandeh & Sadegh Aliakbary & Soroush Taheri, 2023. "An embedding approach for analyzing the evolution of research topics with a case study on computer science subdomains," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(3), pages 1567-1582, March.
    17. An Zeng & Zhesi Shen & Jianlin Zhou & Ying Fan & Zengru Di & Yougui Wang & H. Eugene Stanley & Shlomo Havlin, 2019. "Increasing trend of scientists to switch between topics," Nature Communications, Nature, vol. 10(1), pages 1-11, December.
    18. Lingfei Wu & Dashun Wang & James A. Evans, 2019. "Large teams develop and small teams disrupt science and technology," Nature, Nature, vol. 566(7744), pages 378-382, February.
    19. Lutz Bornmann & Sitaram Devarakonda & Alexander Tekles & George Chacko, 2020. "Disruptive papers published in Scientometrics: meaningful results by using an improved variant of the disruption index originally proposed by Wu, Wang, and Evans (2019)," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(2), pages 1149-1155, May.
    20. Li, Eldon Y. & Liao, Chien Hsiang & Yen, Hsiuju Rebecca, 2013. "Co-authorship networks and research impact: A social capital perspective," Research Policy, Elsevier, vol. 42(9), pages 1515-1530.
    21. Daniel E. Acuna & Stefano Allesina & Konrad P. Kording, 2012. "Predicting scientific success," Nature, Nature, vol. 489(7415), pages 201-202, September.
    22. Carbonneau, Real & Laframboise, Kevin & Vahidov, Rustam, 2008. "Application of machine learning techniques for supply chain demand forecasting," European Journal of Operational Research, Elsevier, vol. 184(3), pages 1140-1154, February.
    23. Morgan R. Frank & David Autor & James E. Bessen & Erik Brynjolfsson & Manuel Cebrian & David J. Deming & Maryann Feldman & Matthew Groh & José Lobo & Esteban Moro & Dashun Wang & Hyejin Youn & Iyad Ra, 2019. "Toward understanding the impact of artificial intelligence on labor," Proceedings of the National Academy of Sciences, Proceedings of the National Academy of Sciences, vol. 116(14), pages 6531-6539, April.
    24. Heejung Byun & Justin Frake & Rajshree Agarwal, 2018. "Leveraging who you know by what you know: Specialization and returns to relational capital," Strategic Management Journal, Wiley Blackwell, vol. 39(7), pages 1803-1833, July.
    25. An Zeng & Ying Fan & Zengru Di & Yougui Wang & Shlomo Havlin, 2021. "Fresh teams are associated with original and multidisciplinary research," Nature Human Behaviour, Nature, vol. 5(10), pages 1314-1322, October.
    26. Fischer, Thomas & Krauss, Christopher, 2018. "Deep learning with long short-term memory networks for financial market predictions," European Journal of Operational Research, Elsevier, vol. 270(2), pages 654-669.
    27. Yang Wang & Benjamin F. Jones & Dashun Wang, 2019. "Early-career setback and future career impact," Nature Communications, Nature, vol. 10(1), pages 1-10, December.
    28. Weihua Li & Tomaso Aste & Fabio Caccioli & Giacomo Livan, 2019. "Early coauthorship with top scientists predicts success in academic careers," Nature Communications, Nature, vol. 10(1), pages 1-9, December.
    29. Youtie, Jan & Rogers, Juan & Heinze, Thomas & Shapira, Philip & Tang, Li, 2013. "Career-based influences on scientific recognition in the United States and Europe: Longitudinal evidence from curriculum vitae data," Research Policy, Elsevier, vol. 42(8), pages 1341-1355.
    30. Soroush Taheri & Sadegh Aliakbary, 2022. "Research trend prediction in computer science publications: a deep neural network approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(2), pages 849-869, February.
    31. Danielle H. Lee, 2019. "Predicting the research performance of early career scientists," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1481-1504, December.
    32. Tian Yu & Guang Yu & Peng-Yu Li & Liang Wang, 2014. "Citation impact prediction for scientific papers using stepwise regression analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1233-1252, November.
    33. Xi Zhang & Xianhai Wang & Hongke Zhao & Patricia Ordóñez de Pablos & Yongqiang Sun & Hui Xiong, 2019. "An effectiveness analysis of altmetrics indices for different levels of artificial intelligence publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1311-1344, June.
    34. Jake M. Hofman & Duncan J. Watts & Susan Athey & Filiz Garip & Thomas L. Griffiths & Jon Kleinberg & Helen Margetts & Sendhil Mullainathan & Matthew J. Salganik & Simine Vazire & Alessandro Vespignani, 2021. "Integrating explanation and prediction in computational social science," Nature, Nature, vol. 595(7866), pages 181-188, July.
    35. Lindahl, Jonas, 2018. "Predicting research excellence at the individual level: The importance of publication rate, top journal publications, and top 10% publications in the case of early career mathematicians," Journal of Informetrics, Elsevier, vol. 12(2), pages 518-533.
    36. Li Hou & Qiang Wu & Yundong Xie, 2022. "Does early publishing in top journals really predict long-term scientific success in the business field?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6083-6107, November.
    37. Hu, Ya-Han & Tai, Chun-Tien & Liu, Kang Ernest & Cai, Cheng-Fang, 2020. "Identification of highly-cited papers using topic-model-based and bibliometric features: the consideration of keyword popularity," Journal of Informetrics, Elsevier, vol. 14(1).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Danielle Lee, 2024. "Exploring the determinants of research performance for early-career researchers: a literature review," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(1), pages 181-235, January.
    2. Wanjun Xia & Tianrui Li & Chongshou Li, 2023. "A review of scientific impact prediction: tasks, features and methods," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 543-585, January.
    3. Lu Liu & Benjamin F. Jones & Brian Uzzi & Dashun Wang, 2023. "Data, measurement and empirical methods in the science of science," Nature Human Behaviour, Nature, vol. 7(7), pages 1046-1058, July.
    4. Li Hou & Qiang Wu & Yundong Xie, 2022. "Does early publishing in top journals really predict long-term scientific success in the business field?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6083-6107, November.
    5. Yue Wang & Ning Li & Bin Zhang & Qian Huang & Jian Wu & Yang Wang, 2023. "The effect of structural holes on producing novel and disruptive research in physics," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(3), pages 1801-1823, March.
    6. Wan Siti Nur Aiza & Liyana Shuib & Norisma Idris & Nur Baiti Afini Normadhi, 2024. "Features, techniques and evaluation in predicting articles’ citations: a review from years 2010–2023," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(1), pages 1-29, January.
    7. Guo, Liying & Wang, Yang & Li, Meiling, 2024. "Exploration, exploitation and funding success: Evidence from junior scientists supported by the Chinese Young Scientists Fund," Journal of Informetrics, Elsevier, vol. 18(2).
    8. Wumei Du & Zheng Xie & Yiqin Lv, 2021. "Predicting publication productivity for authors: Shallow or deep architecture?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5855-5879, July.
    9. Mingyue Sun & Tingcan Ma & Lewei Zhou & Mingliang Yue, 2023. "Analysis of the relationships among paper citation and its influencing factors: a Bayesian network-based approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 3017-3033, May.
    10. Li, Heyang & Wu, Meijun & Wang, Yougui & Zeng, An, 2022. "Bibliographic coupling networks reveal the advantage of diversification in scientific projects," Journal of Informetrics, Elsevier, vol. 16(3).
    11. Martorell Cunil, Onofre & Otero González, Luis & Durán Santomil, Pablo & Mulet Forteza, Carlos, 2023. "How to accomplish a highly cited paper in the tourism, leisure and hospitality field," Journal of Business Research, Elsevier, vol. 157(C).
    12. Peng, Xianzhe & Xu, Huixin & Shi, Jin, 2024. "Are the bibliometric growth patterns of excellent scholars similar? From the analysis of ACM Fellows," Journal of Informetrics, Elsevier, vol. 18(3).
    13. Batista-Jr, Antônio de Abreu & Gouveia, Fábio Castro & Mena-Chalco, Jesús P., 2021. "Predicting the Q of junior researchers using data from the first years of publication," Journal of Informetrics, Elsevier, vol. 15(2).
    14. Li, Meiling & Wang, Yang & Du, Haifeng & Bai, Aruhan, 2024. "Motivating innovation: The impact of prestigious talent funding on junior scientists," Research Policy, Elsevier, vol. 53(9).
    15. Fang Zhang & Shengli Wu, 2024. "Predicting citation impact of academic papers across research areas using multiple models and early citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 4137-4166, July.
    16. Zhongyi Wang & Keying Wang & Jiyue Liu & Jing Huang & Haihua Chen, 2022. "Measuring the innovation of method knowledge elements in scientific literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2803-2827, May.
    17. Yuhao Zhou & Ruijie Wang & An Zeng, 2022. "Predicting the impact and publication date of individual scientists’ future papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 1867-1882, April.
    18. Akella, Akhil Pandey & Alhoori, Hamed & Kondamudi, Pavan Ravikanth & Freeman, Cole & Zhou, Haiming, 2021. "Early indicators of scientific impact: Predicting citations with altmetrics," Journal of Informetrics, Elsevier, vol. 15(2).
    19. Tang, Kun & Li, Baiyang & Zhu, Qiyu & Ma, Lecun, 2024. "Disruptive content, cross agglomeration interaction, and agglomeration replacement: Does cohesion foster strength?," Journal of Informetrics, Elsevier, vol. 18(4).
    20. Anqi Ma & Yu Liu & Xiujuan Xu & Tao Dong, 2021. "A deep-learning based citation count prediction model with paper metadata semantic features," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6803-6823, August.
