
A deep learning approach for identifying biomedical breakthrough discoveries using context analysis

Author

Listed:
  • Xue Wang

    (Chinese Academy of Medical Sciences and Peking Union Medical College)

  • Xuemei Yang

    (Chinese Academy of Medical Sciences and Peking Union Medical College)

  • Jian Du

    (Peking University)

  • Xuwen Wang

    (Chinese Academy of Medical Sciences and Peking Union Medical College)

  • Jiao Li

    (Chinese Academy of Medical Sciences and Peking Union Medical College)

  • Xiaoli Tang

    (Chinese Academy of Medical Sciences and Peking Union Medical College)

Abstract

Breakthrough research in a scientific field usually manifests as a major development or advance. Such advances build to an epiphany where new ways of thinking about a problem become possible. Identifying breakthrough research can be useful for cultivating and funding further innovation. This article presents a new method for identifying scientific breakthroughs from research papers based on cue words commonly associated with major advancements. We looked for specific terms signifying scientific breakthroughs in citing sentences (“citances”) to identify breakthrough articles. By setting a threshold for the number of citances containing breakthrough cue words that peer scholars often use when evaluating research, we identified articles containing breakthrough research. We call this approach the “others-evaluation” process. We then shortlisted candidates from the selected articles based on the authors’ evaluations of their own research, found in the abstracts. We call this the “self-evaluation” process. Combining the two approaches into a dual “others-self” evaluation process, we arrived at a sample of 237 potential breakthrough articles, most of which are recommended by Faculty Opinions. Based on the breakthrough articles identified, we trained SVM, TextCNN, and BERT models to identify abstracts with breakthrough evaluations. This automatic identification model can greatly simplify the others-self evaluation process and promote the identification of breakthrough research.
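
The dual "others-self" filter described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the cue words or the threshold, so `CUE_WORDS`, `min_citances`, the function names, and the sample data below are all hypothetical placeholders chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical cue words; the abstract does not list the actual terms used.
CUE_WORDS = ("breakthrough", "landmark", "seminal", "groundbreaking")
CUE_PATTERN = re.compile(r"\b(" + "|".join(CUE_WORDS) + r")\b", re.IGNORECASE)


def has_cue_word(text):
    """Return True if the text contains at least one cue word."""
    return CUE_PATTERN.search(text) is not None


def others_evaluation(citances, min_citances=2):
    """Select cited articles with at least `min_citances` cue-word citances.

    `citances` is an iterable of (cited_article_id, citing_sentence) pairs.
    The default threshold is an assumption, not a value from the paper.
    """
    counts = Counter(
        article_id for article_id, sentence in citances if has_cue_word(sentence)
    )
    return {article_id for article_id, n in counts.items() if n >= min_citances}


def others_self_evaluation(citances, abstracts, min_citances=2):
    """Keep others-evaluation candidates whose own abstract has a cue word."""
    candidates = others_evaluation(citances, min_citances)
    return {a for a in candidates if has_cue_word(abstracts.get(a, ""))}


# Invented sample data, for illustration only.
sample_citances = [
    ("A", "This landmark study changed the field."),
    ("A", "A breakthrough result was reported in [3]."),
    ("B", "We follow the protocol of [7]."),
    ("B", "A seminal paper on this topic is [7]."),
    ("C", "This groundbreaking work [9] inspired our design."),
    ("C", "The breakthrough in [9] made this possible."),
]
sample_abstracts = {
    "A": "We report a breakthrough in gene editing.",
    "B": "We describe a protocol.",
    "C": "We measure expression levels.",
}

others = others_evaluation(sample_citances)
both = others_self_evaluation(sample_citances, sample_abstracts)
```

In this toy data, articles A and C pass the others-evaluation step because each has two cue-word citances, and only A survives the self-evaluation step because its abstract also contains a cue word. The articles selected this way would then serve as labeled examples for the classifiers the abstract mentions; that training step is not sketched here.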

Suggested Citation

  • Xue Wang & Xuemei Yang & Jian Du & Xuwen Wang & Jiao Li & Xiaoli Tang, 2021. "A deep learning approach for identifying biomedical breakthrough discoveries using context analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5531-5549, July.
  • Handle: RePEc:spr:scient:v:126:y:2021:i:7:d:10.1007_s11192-021-04003-z
    DOI: 10.1007/s11192-021-04003-z


    References listed on IDEAS

    1. Unknown, 1967. "Index," 1967 Conference, August 21-30, 1967, Sydney, New South Wales, Australia 209796, International Association of Agricultural Economists.
    2. Winnink, J.J. & Tijssen, Robert J.W. & van Raan, A.F.J., 2019. "Searching for new breakthroughs in science: How effective are computerised detection algorithms?," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 673-686.
    3. Chen, Chaomei & Chen, Yue & Horowitz, Mark & Hou, Haiyan & Liu, Zeyuan & Pellegrino, Donald, 2009. "Towards an explanatory and computational theory of scientific discovery," Journal of Informetrics, Elsevier, vol. 3(3), pages 191-209.
    4. Chaomei Chen, 2012. "Predictive effects of structural variation on citation counts," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 63(3), pages 431-449, March.
    5. Jos J. Winnink & Robert J. W. Tijssen & Anthony F. J. van Raan, 2016. "Theory‐changing breakthroughs in science: The impact of research teamwork on scientific discoveries," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 67(5), pages 1210-1223, May.
    6. Jordan A. Comins & Loet Leydesdorff, 2017. "Citation algorithms for identifying research milestones driving biomedical innovation," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(3), pages 1495-1504, March.
    7. Holly N. Wolcott & Matthew J. Fouch & Elizabeth R. Hsu & Leo G. DiJoseph & Catherine A. Bernaciak & James G. Corrigan & Duane E. Williams, 2016. "Modeling time-dependent and -independent indicators to facilitate identification of breakthrough research papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(2), pages 807-817, May.
    8. Jordan A. Comins & Thomas W. Hussey, 2015. "Detecting seminal research contributions to the development and use of the global positioning system by reference publication year spectroscopy," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(2), pages 575-580, August.
    9. Chaomei Chen, 2012. "Predictive effects of structural variation on citation counts," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(3), pages 431-449, March.
    10. Saeed-Ul Hassan & Mubashir Imran & Sehrish Iqbal & Naif Radi Aljohani & Raheel Nawaz, 2018. "Deep context of citations using machine-learning models in scholarly full-text articles," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(3), pages 1645-1662, December.
    11. Ponomarev, Ilya V. & Williams, Duane E. & Hackett, Charles J. & Schnell, Joshua D. & Haak, Laurel L., 2014. "Predicting highly cited papers: A Method for Early Detection of Candidate Breakthroughs," Technological Forecasting and Social Change, Elsevier, vol. 81(C), pages 49-55.
    12. Aaron Elkiss & Siwei Shen & Anthony Fader & Güneş Erkan & David States & Dragomir Radev, 2008. "Blind men and elephants: What do citation summaries tell us about a research article?," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 59(1), pages 51-62, January.
    13. Small, Henry & Tseng, Hung & Patek, Mike, 2017. "Discovering discoveries: Identifying biomedical discoveries using citation contexts," Journal of Informetrics, Elsevier, vol. 11(1), pages 46-62.

    Citations


    Cited by:

    1. Shiyun Wang & Yaxue Ma & Jin Mao & Yun Bai & Zhentao Liang & Gang Li, 2023. "Quantifying scientific breakthroughs by a novel disruption indicator based on knowledge entities," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(2), pages 150-167, February.
    2. Li Yao & He Ni, 2023. "Prediction of patent grant and interpreting the key determinants: an application of interpretable machine learning approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 4933-4969, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Min, Chao & Bu, Yi & Sun, Jianjun, 2021. "Predicting scientific breakthroughs based on knowledge structure variations," Technological Forecasting and Social Change, Elsevier, vol. 164(C).
    2. Xian Li & Ronald Rousseau & Liming Liang & Fangjie Xi & Yushuang Lü & Yifan Yuan & Xiaojun Hu, 2022. "Is low interdisciplinarity of references an unexpected characteristic of Nobel Prize winning research?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 2105-2122, April.
    3. Holly N. Wolcott & Matthew J. Fouch & Elizabeth R. Hsu & Leo G. DiJoseph & Catherine A. Bernaciak & James G. Corrigan & Duane E. Williams, 2016. "Modeling time-dependent and -independent indicators to facilitate identification of breakthrough research papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(2), pages 807-817, May.
    4. Jianhua Hou & Bili Zheng & Yang Zhang & Chaomei Chen, 2021. "How do Price medalists’ scholarly impact change before and after their awards?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5945-5981, July.
    5. Sehrish Iqbal & Saeed-Ul Hassan & Naif Radi Aljohani & Salem Alelyani & Raheel Nawaz & Lutz Bornmann, 2021. "A decade of in-text citation analysis based on natural language processing and machine learning techniques: an overview of empirical studies," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6551-6599, August.
    6. Li, Xin & Wen, Yang & Jiang, Jiaojiao & Daim, Tugrul & Huang, Lucheng, 2022. "Identifying potential breakthrough research: A machine learning method using scientific papers and Twitter data," Technological Forecasting and Social Change, Elsevier, vol. 184(C).
    7. J. J. Winnink & Robert J. W. Tijssen, 2014. "R&D dynamics and scientific breakthroughs in HIV/AIDS drugs development: the case of Integrase Inhibitors," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 1-16, October.
    8. Winnink, J.J. & Tijssen, Robert J.W. & van Raan, A.F.J., 2019. "Searching for new breakthroughs in science: How effective are computerised detection algorithms?," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 673-686.
    9. Ajiferuke, Isola & Famoye, Felix, 2015. "Modelling count response variables in informetric studies: Comparison among count, linear, and lognormal regression models," Journal of Informetrics, Elsevier, vol. 9(3), pages 499-513.
    10. Mingyang Wang & Zhenyu Wang & Guangsheng Chen, 2019. "Which can better predict the future success of articles? Bibliometric indices or alternative metrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1575-1595, June.
    11. Lanu Kim & Jason H. Portenoy & Jevin D. West & Katherine W. Stovel, 2020. "Scientific journals still matter in the era of academic search engines and preprint archives," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 71(10), pages 1218-1226, October.
    12. Zehra Taşkın, 2021. "Forecasting the future of library and information science and its sub-fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1527-1551, February.
    13. Wanjun Xia & Tianrui Li & Chongshou Li, 2023. "A review of scientific impact prediction: tasks, features and methods," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 543-585, January.
    14. Heng Huang & Donghua Zhu & Xuefeng Wang, 2022. "Evaluating scientific impact of publications: combining citation polarity and purpose," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5257-5281, September.
    15. Jingwei Han & Zhixiong Tan & Maozhi Chen & Liang Zhao & Ling Yang & Siying Chen, 2022. "Carbon Footprint Research Based on Input–Output Model—A Global Scientometric Visualization Analysis," IJERPH, MDPI, vol. 19(18), pages 1-23, September.
    16. Jianwei Qian & Huawen Shen & Rob Law, 2018. "Research in Sustainable Tourism: A Longitudinal Study of Articles between 2008 and 2017," Sustainability, MDPI, vol. 10(3), pages 1-13, February.
    17. Kaile Gong & Juan Xie & Ying Cheng & Vincent Larivière & Cassidy R. Sugimoto, 2019. "The citation advantage of foreign language references for Chinese social science papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(3), pages 1439-1460, September.
    18. Jos J. Winnink & Robert J. W. Tijssen & Anthony F. J. van Raan, 2016. "Theory‐changing breakthroughs in science: The impact of research teamwork on scientific discoveries," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 67(5), pages 1210-1223, May.
    19. Lv, Yanhua & Ding, Ying & Song, Min & Duan, Zhiguang, 2018. "Topology-driven trend analysis for drug discovery," Journal of Informetrics, Elsevier, vol. 12(3), pages 893-905.
    20. Iman Tahamtan & Askar Safipour Afshar & Khadijeh Ahamdzadeh, 2016. "Factors affecting number of citations: a comprehensive review of the literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1195-1225, June.
