Printed from https://ideas.repec.org/a/gam/jsusta/v15y2023i5p3919-d1075774.html

Research on the Automatic Subject-Indexing Method of Academic Papers Based on Climate Change Domain Ontology

Author

Listed:
  • Heng Yang

    (Chinese Academy of Sciences, Northwest Institute of Eco-Environment and Resources, Lanzhou 730000, China)

  • Nan Wang

    (Chinese Academy of Sciences, Northwest Institute of Eco-Environment and Resources, Lanzhou 730000, China)

  • Lina Yang

    (Chinese Academy of Sciences, Northwest Institute of Eco-Environment and Resources, Lanzhou 730000, China)

  • Wei Liu

    (Chinese Academy of Sciences, Northwest Institute of Eco-Environment and Resources, Lanzhou 730000, China)

  • Sili Wang

    (Chinese Academy of Sciences, Northwest Institute of Eco-Environment and Resources, Lanzhou 730000, China)

Abstract

Fine-grained classification of academic papers helps uncover deeper implicit themes and semantics, which in turn supports semantic retrieval, paper recommendation, research trend prediction, topic analysis, and related functions. Based on a climate change domain ontology, this study used an unsupervised approach that combines two methods, syntactic structure and semantic modeling, to build a subject-indexing framework for academic papers in the climate change domain. The framework takes the titles, abstracts, and keywords of papers as input and automatically indexes a set of conceptual terms from the domain ontology as research topics. It relies on natural language processing techniques such as syntactic dependencies, text similarity calculation, pre-trained language models, and semantic similarity calculation, together with weighting factors such as word frequency statistics and graph path calculation. Finally, we evaluated the proposed method against a gold standard of manually annotated articles and demonstrated significant improvements over five alternative methods in terms of precision, recall, and F1-score. Overall, the proposed method identifies the research topics of academic papers more accurately and provides a useful reference for applying domain ontologies and for unsupervised data annotation.
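
The evaluation described in the abstract compares automatically indexed concepts with manually annotated ones using precision, recall, and F1-score. As a minimal sketch of how such set-based scores are commonly computed for a single paper (not the authors' evaluation code), consider the following. The function name `prf1` and the concept labels are hypothetical and are used only for illustration.

```python
def prf1(predicted, gold):
    """Set-based precision, recall and F1 for one paper's indexed concepts."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)  # concepts found in both sets
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels, not taken from the paper or its ontology.
predicted = ["sea level rise", "glacier retreat", "carbon sink"]
gold = ["sea level rise", "glacier retreat", "permafrost thaw", "albedo"]

p, r, f = prf1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Here two of the three predicted concepts appear in the gold annotation, so precision is 2/3 and recall is 2/4. Averaging such per-paper scores over a corpus is one common way to report results; the abstract does not state which averaging scheme the authors used.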

Suggested Citation

  • Heng Yang & Nan Wang & Lina Yang & Wei Liu & Sili Wang, 2023. "Research on the Automatic Subject-Indexing Method of Academic Papers Based on Climate Change Domain Ontology," Sustainability, MDPI, vol. 15(5), pages 1-13, February.
  • Handle: RePEc:gam:jsusta:v:15:y:2023:i:5:p:3919-:d:1075774

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/5/3919/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/5/3919/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Carolina Navarro-Lopez & Salvador Linares-Mustaros & Carles Mulet-Forteza, 2022. "“The Statistical Analysis of Compositional Data” by John Aitchison (1986): A Bibliometric Overview," SAGE Open, vol. 12(2), pages 21582440221, April.
    2. Yanto Chandra, 2018. "Mapping the evolution of entrepreneurship as a field of research (1990–2013): A scientometric analysis," PLOS ONE, Public Library of Science, vol. 13(1), pages 1-24, January.
    3. Gao, Qiang & Liang, Zhentao & Wang, Ping & Hou, Jingrui & Chen, Xiuxiu & Liu, Manman, 2021. "Potential index: Revealing the future impact of research topics based on current knowledge networks," Journal of Informetrics, Elsevier, vol. 15(3).
    4. Shuo Xu & Liyuan Hao & Xin An & Hongshen Pang & Ting Li, 2020. "Review on emerging research topics with key-route main path analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(1), pages 607-624, January.
    5. Wang, Xiaoguang & He, Jing & Huang, Han & Wang, Hongyu, 2022. "MatrixSim: A new method for detecting the evolution paths of research topics," Journal of Informetrics, Elsevier, vol. 16(4).
    6. Lu Huang & Xiang Chen & Yi Zhang & Changtian Wang & Xiaoli Cao & Jiarun Liu, 2022. "Identification of topic evolution: network analytics with piecewise linear representation and word embedding," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5353-5383, September.
    7. June Young Lee & Sejung Ahn & Dohyun Kim, 2021. "Deep learning-based prediction of future growth potential of technologies," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-16, June.
    8. Minxi Wang & Ping Liu & Zhaoliang Gu & Hong Cheng & Xin Li, 2019. "A Scientometric Review of Resource Recycling Industry," IJERPH, MDPI, vol. 16(23), pages 1-18, November.
    9. Balázs Győrffy & Andrea Magda Nagy & Péter Herman & Ádám Török, 2018. "Factors influencing the scientific performance of Momentum grant holders: an evaluation of the first 117 research groups," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 409-426, October.
    10. Naif Radi Aljohani & Ayman Fayoumi & Saeed-Ul Hassan, 2021. "An in-text citation classification predictive model for a scholarly search system," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5509-5529, July.
    11. Hong Shi & Mengmeng Cheng & Yi Feng & Chenghui Qiu & Caiyue Song & Nenglin Yuan & Chuanzhi Kang & Kaijie Yang & Jie Yuan & Yonghao Li, 2023. "Thermal Management Techniques for Lithium-Ion Batteries Based on Phase Change Materials: A Systematic Review and Prospective Recommendations," Energies, MDPI, vol. 16(2), pages 1-23, January.
    12. Andrej Kastrin & Dimitar Hristovski, 2021. "Scientometric analysis and knowledge mapping of literature-based discovery (1986–2020)," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1415-1451, February.
    13. Francisco Diez-Martin & Alicia Blanco-Gonzalez & Camilo Prado-Roman, 2019. "Research Challenges in Digital Marketing: Sustainability," Sustainability, MDPI, vol. 11(10), pages 1-13, May.
    14. Francisco Díez-Martín & Alicia Blanco-González & Camilo Prado-Román, 2021. "The intellectual structure of organizational legitimacy research: a co-citation analysis in business journals," Review of Managerial Science, Springer, vol. 15(4), pages 1007-1043, May.
    15. Orlando Fonseca Guilarte & Simone Diniz Junqueira Barbosa & Sinesio Pesco, 2021. "RelPath: an interactive tool to visualize branches of studies and quantify the expertise of authors by citation paths," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(6), pages 4871-4897, June.
    16. Mauricio Marrone, 2020. "Application of entity linking to identify research fronts and trends," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(1), pages 357-379, January.
    17. Sandeep Soni & Kristina Lerman & Jacob Eisenstein, 2021. "Follow the leader: Documents on the leading edge of semantic change get more citations," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(4), pages 478-492, April.
    18. Pinho, Celso R.A. & Pinho, Maria Luiza C.A. & Deligonul, Seyda Z. & Tamer Cavusgil, S., 2022. "The agility construct in the literature: Conceptualization and bibliometric assessment," Journal of Business Research, Elsevier, vol. 153(C), pages 517-532.
    19. Fatima Afzal & Roksana Jahan Tumpa, 2024. "Exploring Leadership Styles to Foster Sustainability in Construction Projects: A Systematic Literature Review," Sustainability, MDPI, vol. 16(3), pages 1-32, January.
    20. Boyack, Kevin W. & Klavans, Richard, 2014. "Including cited non-source items in a large-scale map of science: What difference does it make?," Journal of Informetrics, Elsevier, vol. 8(3), pages 569-580.
