IDEAS home Printed from https://ideas.repec.org/a/eee/infome/v18y2024i4s1751157724000634.html

Exploring motivations for algorithm mention in the domain of natural language processing: A deep learning approach

Author

Listed:
  • Wang, Yuzhuo
  • Xiang, Yi
  • Zhang, Chengzhi

Abstract

With the rise of the fourth paradigm of scientific research, algorithms have become increasingly important to science. In academic papers, scholars mention algorithms with various motivations, such as using, comparing, or improving them to solve complex research tasks. Identifying these motivations can help scholars discover the relationships between algorithms and further assess their roles and values. Taking the field of natural language processing (NLP) as an example, this article therefore proposes a complete method for identifying motivations for mentioning algorithms at the sentence level and for analyzing their distribution and evolution. Specifically, using manual annotation and machine learning methods, we identify algorithm entities and the sentences that mention them in the full text of papers, classify the motivations for mentioning algorithms with pre-trained models and data augmentation techniques, and finally analyze the distribution and evolution of motivations. The results show that the deep learning models trained with the augmented data outperform the traditional machine learning models in the classification task. In academic papers, more than half of the sentences show the direct use of algorithms, while improving algorithms is the least frequent motivation, and the diversity of motivations has been increasing with time. Among algorithm categories, grammatical algorithms are more often mentioned with the “description” motivation, while “use” motivations are more common for machine learning algorithms. Over time, “use” motivations gradually replaced “description” motivations for different algorithms, and the number of motivation types decreased significantly. Our research explores the identification, distribution, and evolution of authors’ motivations for mentioning algorithm entities, which could provide a basis for future algorithm relationship identification and influence evaluation using motivations.
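
The distribution and evolution analysis described in the abstract can be illustrated with a minimal sketch. It assumes that each algorithm sentence has already been assigned a publication year and a motivation label. The records, the label set ("use", "description", "comparison", "improvement"), the function names, and the use of Shannon entropy as a diversity measure are all hypothetical choices made for illustration; they are not taken from the article and do not reproduce the authors' method or results.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy records: (publication year, motivation label).
# In practice these labels would come from a trained sentence classifier.
SENTENCES = [
    (2000, "description"), (2000, "description"), (2000, "use"),
    (2010, "use"), (2010, "use"), (2010, "comparison"), (2010, "description"),
    (2020, "use"), (2020, "use"), (2020, "use"), (2020, "improvement"),
]


def motivation_distribution(records):
    """Return, for each year, the share of sentences per motivation label."""
    counts = defaultdict(Counter)
    for year, label in records:
        counts[year][label] += 1
    return {
        year: {label: n / sum(c.values()) for label, n in c.items()}
        for year, c in sorted(counts.items())
    }


def shannon_entropy(shares):
    """Diversity of a distribution in bits (an assumed measure, not the paper's)."""
    return -sum(p * math.log2(p) for p in shares.values() if p > 0)


distribution = motivation_distribution(SENTENCES)
for year, shares in distribution.items():
    print(year, shares, round(shannon_entropy(shares), 3))
```

Comparing the per-year shares shows which motivation dominates in each period, and the entropy value gives one rough way to track whether the mix of motivations becomes more or less diverse over time.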

Suggested Citation

  • Wang, Yuzhuo & Xiang, Yi & Zhang, Chengzhi, 2024. "Exploring motivations for algorithm mention in the domain of natural language processing: A deep learning approach," Journal of Informetrics, Elsevier, vol. 18(4).
  • Handle: RePEc:eee:infome:v:18:y:2024:i:4:s1751157724000634
    DOI: 10.1016/j.joi.2024.101550

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157724000634
    Download Restriction: Full text for ScienceDirect subscribers only


    Citations

    Cited by:

    1. Chengzhi Zhang & Philipp Mayr & Wei Lu & Yi Zhang, 2024. "An editorial note on extraction and evaluation of knowledge entities from scientific documents," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(11), pages 7169-7174, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiaorui Jiang & Jingqiang Chen, 2023. "Contextualised segment-wise citation function classification," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5117-5158, September.
    2. Ruihua Qi & Jia Wei & Zhen Shao & Zhengguang Li & Heng Chen & Yunhao Sun & Shaohua Li, 2023. "Multi-task learning model for citation intent classification in scientific publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(12), pages 6335-6355, December.
    3. Indra Budi & Yaniasih Yaniasih, 2023. "Understanding the meanings of citations using sentiment, role, and citation function classifications," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 735-759, January.
    4. Li, Xin & Tang, Xuli & Lu, Wei, 2024. "Investigating clinical links in edge-labeled citation networks of biomedical research: A translational science perspective," Journal of Informetrics, Elsevier, vol. 18(3).
    5. Yi Zhang & Chengzhi Zhang & Philipp Mayr & Arho Suominen, 2022. "An editorial of “AI + informetrics”: multi-disciplinary interactions in the era of big data," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6503-6507, November.
    6. Xiaorui Jiang, 2025. "Ensembling approaches to citation function classification and important citation screening," Scientometrics, Springer;Akadémiai Kiadó, vol. 130(3), pages 1371-1419, March.
    7. Krittin Chatrinan & Thanapon Noraset & Suppawong Tuarob, 2025. "GAN-CITE: leveraging semi-supervised generative adversarial networks for citation function classification with limited data," Scientometrics, Springer;Akadémiai Kiadó, vol. 130(2), pages 679-703, February.
    8. Percia David, Dimitri & Maréchal, Loïc & Lacube, William & Gillard, Sébastien & Tsesmelis, Michael & Maillart, Thomas & Mermoud, Alain, 2023. "Measuring security development in information technologies: A scientometric framework using arXiv e-prints," Technological Forecasting and Social Change, Elsevier, vol. 188(C).
    9. Dongin Nam & Jiwon Kim & Jeeyoung Yoon & Chaemin Song & Seongdeok Kim & Min Song, 2024. "Examining knowledge entities and its relationships based on citation sentences using a multi-anchor bipartite network," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(11), pages 7197-7228, November.
    10. Pilar Valderrama & Manuel Escabias & Evaristo Jiménez-Contreras & Mariano J. Valderrama & Pilar Baca, 2018. "A mixed longitudinal and cross-sectional model to forecast the journal impact factor in the field of Dentistry," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 1203-1212, August.
    11. Cristina López-Duarte & Marta M. Vidal-Suárez & Belén González-Díaz, 2019. "Cross-national distance and international business: an analysis of the most influential recent models," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(1), pages 173-208, October.
    12. Chen, Xi & Mao, Jin & Li, Gang, 2024. "A co-citation approach to the analysis on the interaction between scientific and technological knowledge," Journal of Informetrics, Elsevier, vol. 18(3).
    13. Yuzhuo Wang & Chengzhi Zhang & Kai Li, 2022. "A review on method entities in the academic literature: extraction, evaluation, and application," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2479-2520, May.
    14. Kayvan Kousha & Mike Thelwall, 2024. "Factors associating with or predicting more cited or higher quality journal articles: An Annual Review of Information Science and Technology (ARIST) paper," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 75(3), pages 215-244, March.
    15. Xingyu Gao & Qiang Wu & Yuanyuan Liu & Ruilu Yang, 2024. "Pasteur’s quadrant in AI: do patent-cited papers have higher scientific impact?," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(2), pages 909-932, February.
    16. Wang, Zhenhua & Ren, Ming & Gao, Dong & Li, Zhuang, 2023. "A Zipf's law-based text generation approach for addressing imbalance in entity extraction," Journal of Informetrics, Elsevier, vol. 17(4).
    17. Meho, Lokman I., 2019. "Using Scopus’s CiteScore for assessing the quality of computer science conferences," Journal of Informetrics, Elsevier, vol. 13(1), pages 419-433.
    18. Juan Xie & Kaile Gong & Jiang Li & Qing Ke & Hyonchol Kang & Ying Cheng, 2019. "A probe into 66 factors which are possibly associated with the number of citations an article received," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1429-1454, June.
    19. Yuzhuo Wang & Kai Li, 2024. "How do official software citation formats evolve over time? A longitudinal analysis of R programming language packages," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 3997-4019, July.
    20. Arash Moghadasi, 2024. "Do SMEs Consider Open Data as a Vital Intellectual Asset? a Systematic Literature Review," Journal of the Knowledge Economy, Springer;Portland International Center for Management of Engineering and Technology (PICMET), vol. 15(3), pages 11784-11818, September.
