Exploiting Contextual Word Embedding of Authorship and Title of Articles for Discovering Citation Intent Classification

Author

Listed:
  • Muhammad Roman
  • Abdul Shahid
  • Muhammad Irfan Uddin
  • Qiaozhi Hua
  • Shazia Maqsood
  • Furqan Aziz

Abstract

The number of scientific publications is growing exponentially. Research articles cite other work for various reasons and, therefore, have been studied extensively to associate documents. It is argued that not all references carry the same level of importance. It is essential to understand the reason for citation, called citation intent or function. Text information can contribute well if new natural language processing techniques are applied to capture the context of text data. In this paper, we have used contextualized word embedding to find the numerical representation of text features. We further investigated the performance of various machine-learning techniques on the numerical representation of text. The performance of each of the classifiers was evaluated on two state-of-the-art datasets containing the text features. In the case of the unbalanced dataset, we observed that the linear Support Vector Machine (SVM) achieved 86% accuracy for the “background” class, where the training was extensive. For the rest of the classes, including “motivation,” “extension,” and “future,” the machine was trained on less than 100 records; therefore, the accuracy was only 57 to 64%. In the case of a balanced dataset, each of the classes has the same accuracy as trained on the same size of training data. Overall, SVM performed best on both of the datasets, followed by the stochastic gradient descent classifier; therefore, SVM can produce good results as text classification on top of contextual word embedding.
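
The abstract reports accuracy separately for each citation-intent class, which matters on an unbalanced dataset because a large class such as "background" can dominate the overall score. The sketch below is only an illustration of how such per-class figures can be computed; the function name `per_class_accuracy` and the toy labels are assumptions made here and are not taken from the paper or its datasets.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return {label: fraction of that label's records predicted correctly}."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


# Hypothetical toy labels (NOT data from the paper): one large class
# and two small ones, mimicking an unbalanced citation-intent dataset.
y_true = ["background"] * 6 + ["motivation"] * 2 + ["future"] * 2
y_pred = (
    ["background"] * 5 + ["motivation"]   # predictions for the 6 "background" records
    + ["motivation", "background"]        # predictions for the 2 "motivation" records
    + ["future", "background"]            # predictions for the 2 "future" records
)

scores = per_class_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

for label, acc in scores.items():
    print(f"{label}: {acc:.2f}")
print(f"overall: {overall:.2f}")
```

In this toy case the overall accuracy sits between the score of the large class and the scores of the small classes, which is why a single aggregate number can hide weak performance on rarely seen intents.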

Suggested Citation

  • Muhammad Roman & Abdul Shahid & Muhammad Irfan Uddin & Qiaozhi Hua & Shazia Maqsood & Furqan Aziz, 2021. "Exploiting Contextual Word Embedding of Authorship and Title of Articles for Discovering Citation Intent Classification," Complexity, Hindawi, vol. 2021, pages 1-13, April.
  • Handle: RePEc:hin:complx:5554874
    DOI: 10.1155/2021/5554874

    Download full text from publisher

    File URL: http://downloads.hindawi.com/journals/complexity/2021/5554874.pdf
    Download Restriction: no

    File URL: http://downloads.hindawi.com/journals/complexity/2021/5554874.xml
    Download Restriction: no

    File URL: https://libkey.io/10.1155/2021/5554874?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
