Printed from https://ideas.repec.org/a/eee/tefoso/v184y2022ics0040162522005637.html

Identifying potential breakthrough research: A machine learning method using scientific papers and Twitter data

Author

Listed:
  • Li, Xin
  • Wen, Yang
  • Jiang, Jiaojiao
  • Daim, Tugrul
  • Huang, Lucheng

Abstract

Breakthrough research may signal shifts in science, technology, and innovation systems. Early identification of breakthrough research is important not only for scientists but also for policy makers and R&D experts who develop R&D strategies and allocate R&D resources. Researchers mostly use scientific paper data to identify potential breakthrough research, but they rarely make use of machine learning methods or of Twitter data related to scientific research. Analysis of Twitter data helps us understand the public's perception of potential breakthrough research and identify such research. Machine learning methods can help predict the trend of events by drawing on prior knowledge and experience. This paper therefore proposes a framework for identifying potential breakthrough research using machine learning methods with scientific papers and Twitter data. We select solar cells as a case study to verify the validity and flexibility of this framework. In this case, we use a machine learning method to discover potential breakthrough research from scientific papers, and we use Twitter data mining to analyze Twitter users' perception of and response to the discovered research, with the aim of achieving a broader and more diverse assessment of it. This paper contributes to identifying potential breakthrough research and to understanding the emergence and development of breakthrough research. It will be of interest to R&D experts in the field of solar cell technology.

Suggested Citation

  • Li, Xin & Wen, Yang & Jiang, Jiaojiao & Daim, Tugrul & Huang, Lucheng, 2022. "Identifying potential breakthrough research: A machine learning method using scientific papers and Twitter data," Technological Forecasting and Social Change, Elsevier, vol. 184(C).
  • Handle: RePEc:eee:tefoso:v:184:y:2022:i:c:s0040162522005637
    DOI: 10.1016/j.techfore.2022.122042

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0040162522005637
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.techfore.2022.122042?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Li, Xin & Xie, Qianqian & Jiang, Jiaojiao & Zhou, Yuan & Huang, Lucheng, 2019. "Identifying and monitoring the development trends of emerging technologies using patent analysis and Twitter data mining: The case of perovskite solar cell technology," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 687-705.
    2. Winnink, J.J. & Tijssen, Robert J.W. & van Raan, A.F.J., 2019. "Searching for new breakthroughs in science: How effective are computerised detection algorithms?," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 673-686.
    3. Wang, Jian & Veugelers, Reinhilde & Stephan, Paula, 2017. "Bias against novelty in science: A cautionary tale for users of bibliometric indicators," Research Policy, Elsevier, vol. 46(8), pages 1416-1436.
    4. Hendrik P. van Dalen & Kène Henkens, 1999. "How Influential Are Demography Journals?," Population and Development Review, The Population Council, Inc., vol. 25(2), pages 229-251, June.
    5. Rickard Danell, 2011. "Can the quality of scientific work be predicted using information on the author's track record?," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(1), pages 50-60, January.
    6. Nick Haslam & Lauren Ban & Leah Kaufmann & Stephen Loughnan & Kim Peters & Jennifer Whelan & Sam Wilson, 2008. "What makes an article influential? Predicting impact in social and personality psychology," Scientometrics, Springer;Akadémiai Kiadó, vol. 76(1), pages 169-185, July.
    7. Ronald N. Kostoff, 2007. "The difference between highly and poorly cited medical articles in the journal Lancet," Scientometrics, Springer;Akadémiai Kiadó, vol. 72(3), pages 513-520, September.
    8. Laudel, Grit & Gläser, Jochen, 2014. "Beyond breakthrough research: Epistemic properties of research and their consequences for research funding," Research Policy, Elsevier, vol. 43(7), pages 1204-1216.
    9. Wanying Ding & Chaomei Chen, 2014. "Dynamic topic detection and tracking: A comparison of HDP, C-word, and cocitation methods," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 65(10), pages 2084-2097, October.
    10. Robert J. W. Tijssen & Martijn S. Visser & Thed N. van Leeuwen, 2002. "Benchmarking international scientific excellence: Are highly cited research papers an appropriate frame of reference?," Scientometrics, Springer;Akadémiai Kiadó, vol. 54(3), pages 381-397, July.
    11. Min, Chao & Bu, Yi & Sun, Jianjun, 2021. "Predicting scientific breakthroughs based on knowledge structure variations," Technological Forecasting and Social Change, Elsevier, vol. 164(C).
    12. Jesper W. Schneider & Rodrigo Costas, 2017. "Identifying potential “breakthrough” publications using refined citation analyses: Three related explorative approaches," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(3), pages 709-723, March.
    13. Rickard Danell, 2011. "Can the quality of scientific work be predicted using information on the author's track record?," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 62(1), pages 50-60, January.
    14. Fereshteh Didegah & Mike Thelwall, 2013. "Determinants of research citation impact in nanoscience and nanotechnology," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 64(5), pages 1055-1064, May.
    15. Teh, Yee Whye & Jordan, Michael I. & Beal, Matthew J. & Blei, David M., 2006. "Hierarchical Dirichlet Processes," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1566-1581, December.
    16. Marc Julius & Charles E. Berkoff & Alvin E. Strack & Frank Krasovec & A. Douglas Bender, 1977. "A very early warning system for the rapid identification and transfer of new technology," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 28(3), pages 170-174, May.
    17. Holly N. Wolcott & Matthew J. Fouch & Elizabeth R. Hsu & Leo G. DiJoseph & Catherine A. Bernaciak & James G. Corrigan & Duane E. Williams, 2016. "Modeling time-dependent and -independent indicators to facilitate identification of breakthrough research papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(2), pages 807-817, May.
    18. Alan L. Porter & Alisa Kongthon & Jye-Chyi (JC) Lu, 2002. "Research profiling: Improving the literature review," Scientometrics, Springer;Akadémiai Kiadó, vol. 53(3), pages 351-370, March.
    19. Fereshteh Didegah & Mike Thelwall, 2013. "Determinants of research citation impact in nanoscience and nanotechnology," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 64(5), pages 1055-1064, May.
    20. Hendrik P. Van Dalen & Kène Henkens, 2001. "What makes a scientific article influential? The case of demographers," Scientometrics, Springer;Akadémiai Kiadó, vol. 50(3), pages 455-482, March.
    21. Chai, Sen & Menon, Anoop, 2019. "Breakthrough recognition: Bias against novelty and competition for attention," Research Policy, Elsevier, vol. 48(3), pages 733-747.
    22. Ponomarev, Ilya V. & Williams, Duane E. & Hackett, Charles J. & Schnell, Joshua D. & Haak, Laurel L., 2014. "Predicting highly cited papers: A Method for Early Detection of Candidate Breakthroughs," Technological Forecasting and Social Change, Elsevier, vol. 81(C), pages 49-55.
    23. Tian Yu & Guang Yu & Peng-Yu Li & Liang Wang, 2014. "Citation impact prediction for scientific papers using stepwise regression analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1233-1252, November.
    24. Nofer, Michael & Hinz, Oliver, 2015. "Using Twitter to Predict the Stock Market: Where is the Mood Effect?," Publications of Darmstadt Technical University, Institute for Business Studies (BWL) 77140, Darmstadt Technical University, Department of Business Administration, Economics and Law, Institute for Business Studies (BWL).
    25. J. J. Winnink & Robert J. W. Tijssen, 2015. "Early stage identification of breakthroughs at the interface of science and technology: lessons drawn from a landmark publication," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(1), pages 113-134, January.
    26. Porter, Alan L. & Chiavetta, Denise & Newman, Nils C., 2020. "Measuring tech emergence: A contest," Technological Forecasting and Social Change, Elsevier, vol. 159(C).
    27. Ilya V. Ponomarev & Brian K. Lawton & Duane E. Williams & Joshua D. Schnell, 2014. "Breakthrough paper indicator 2.0: can geographical diversity and interdisciplinarity improve the accuracy of outstanding papers prediction?," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(3), pages 755-765, September.
    28. Li, Xin & Xie, Qianqian & Daim, Tugrul & Huang, Lucheng, 2019. "Forecasting technology trends using text mining of the gaps between science and technology: The case of perovskite solar cell technology," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 432-449.
    29. Dag W Aksnes, 2003. "Characteristics of highly cited papers," Research Evaluation, Oxford University Press, vol. 12(3), pages 159-170, December.
    30. Michael Nofer & Oliver Hinz, 2015. "Using Twitter to Predict the Stock Market," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 57(4), pages 229-242, August.

    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Javier Jiménez-Cabas & Lizeth Torres & Jorge de J. Lozoya-Santos, 2023. "Twitter Data Mining for the Diagnosis of Leaks in Drinking Water Distribution Networks," Sustainability, MDPI, vol. 15(6), pages 1-16, March.
    2. Betz, Ulrich A.K. & Arora, Loukik & Assal, Reem A. & Azevedo, Hatylas & Baldwin, Jeremy & Becker, Michael S. & Bostock, Stefan & Cheng, Vinton & Egle, Tobias & Ferrari, Nicola & Schneider-Futschik, El, 2023. "Game changers in science and technology - now and beyond," Technological Forecasting and Social Change, Elsevier, vol. 193(C).
    3. Su, Yu-Shan & Huang, Hsini & Daim, Tugrul & Chien, Pan-Wei & Peng, Ru-Ling & Karaman Akgul, Arzu, 2023. "Assessing the technological trajectory of 5G-V2X autonomous driving inventions: Use of patent analysis," Technological Forecasting and Social Change, Elsevier, vol. 196(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tian Yu & Guang Yu & Peng-Yu Li & Liang Wang, 2014. "Citation impact prediction for scientific papers using stepwise regression analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1233-1252, November.
    2. Fenghua Wang & Ying Fan & An Zeng & Zengru Di, 2019. "Can we predict ESI highly cited publications?," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(1), pages 109-125, January.
    3. Min, Chao & Bu, Yi & Sun, Jianjun, 2021. "Predicting scientific breakthroughs based on knowledge structure variations," Technological Forecasting and Social Change, Elsevier, vol. 164(C).
    4. Peter Klimek & Aleksandar Jovanovic & Rainer Egloff & Reto Schneider, 2016. "Successful fish go with the flow: citation impact prediction based on centrality measures for term–document networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1265-1282, June.
    5. Lindahl, Jonas, 2018. "Predicting research excellence at the individual level: The importance of publication rate, top journal publications, and top 10% publications in the case of early career mathematicians," Journal of Informetrics, Elsevier, vol. 12(2), pages 518-533.
    6. Libo Sheng & Dongqing Lyu & Xuanmin Ruan & Hongquan Shen & Ying Cheng, 2023. "The association between prior knowledge and the disruption of an article," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(8), pages 4731-4751, August.
    7. Winnink, J.J. & Tijssen, Robert J.W. & van Raan, A.F.J., 2019. "Searching for new breakthroughs in science: How effective are computerised detection algorithms?," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 673-686.
    8. Mingyang Wang & Zhenyu Wang & Guangsheng Chen, 2019. "Which can better predict the future success of articles? Bibliometric indices or alternative metrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1575-1595, June.
    9. Basma Albanna & Julia Handl & Richard Heeks, 2021. "Publication outperformance among global South researchers: An analysis of individual-level and publication-level predictors of positive deviance," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(10), pages 8375-8431, October.
    10. Didegah, Fereshteh & Thelwall, Mike, 2013. "Which factors help authors produce the highest impact research? Collaboration, journal and document properties," Journal of Informetrics, Elsevier, vol. 7(4), pages 861-873.
    11. Mingyang Wang & Guang Yu & Shuang An & Daren Yu, 2012. "Discovery of factors influencing citation impact based on a soft fuzzy rough set model," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(3), pages 635-644, December.
    12. Shiyun Wang & Yaxue Ma & Jin Mao & Yun Bai & Zhentao Liang & Gang Li, 2023. "Quantifying scientific breakthroughs by a novel disruption indicator based on knowledge entities," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(2), pages 150-167, February.
    13. Wanjun Xia & Tianrui Li & Chongshou Li, 2023. "A review of scientific impact prediction: tasks, features and methods," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 543-585, January.
    14. Kaile Gong & Juan Xie & Ying Cheng & Vincent Larivière & Cassidy R. Sugimoto, 2019. "The citation advantage of foreign language references for Chinese social science papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(3), pages 1439-1460, September.
    15. Wang, Mingyang & Yu, Guang & Xu, Jianzhong & He, Huixin & Yu, Daren & An, Shuang, 2012. "Development a case-based classifier for predicting highly cited papers," Journal of Informetrics, Elsevier, vol. 6(4), pages 586-599.
    16. Bornmann, Lutz & Haunschild, Robin & Mutz, Rüdiger, 2020. "Should citations be field-normalized in evaluative bibliometrics? An empirical analysis based on propensity score matching," Journal of Informetrics, Elsevier, vol. 14(4).
    17. Iman Tahamtan & Askar Safipour Afshar & Khadijeh Ahamdzadeh, 2016. "Factors affecting number of citations: a comprehensive review of the literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1195-1225, June.
    18. Stegehuis, Clara & Litvak, Nelly & Waltman, Ludo, 2015. "Predicting the long-term citation impact of recent publications," Journal of Informetrics, Elsevier, vol. 9(3), pages 642-657.
    19. Ruan, Xuanmin & Zhu, Yuanyang & Li, Jiang & Cheng, Ying, 2020. "Predicting the citation counts of individual papers via a BP neural network," Journal of Informetrics, Elsevier, vol. 14(3).
    20. Guoqiang Liang & Haiyan Hou & Xiaodan Lou & Zhigang Hu, 2019. "Qualifying threshold of “take-off” stage for successfully disseminated creative ideas," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(3), pages 1193-1208, September.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:tefoso:v:184:y:2022:i:c:s0040162522005637. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact Catherine Liu. General contact details of provider: http://www.sciencedirect.com/science/journal/00401625

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.