Printed from https://ideas.repec.org/a/spr/scient/v128y2023i12d10.1007_s11192-023-04851-x.html

Identifying the driving factors of word co-occurrence: a perspective of semantic relations

Author

Listed:
  • Yiming Zhao

    (Wuhan University)

  • Jiaying Yin

    (Wuhan University)

  • Jin Zhang

    (University of Wisconsin Milwaukee)

  • Linrong Wu

    (Wuhan University)

Abstract

This study aims to investigate and identify the driving factors of word co-occurrence from the perspective of semantic relations between frequently co-occurring words. Natural sentences in a corpus of news articles were used as co-occurrence windows to extract co-occurring word pairs, and the distance between the two words in a pair was not limited. ConceptNet (a semantic knowledge base) was used to annotate the semantic relation between co-occurring words. To solve the problem that some co-occurring word pairs fail to match direct semantic relations in ConceptNet, we proposed a relation annotation method that connects them through an intermediate word. Results showed that six semantic relations in ConceptNet (i.e., RelatedTo, IsA, Synonym, HasContext, Antonym, and MannerOf) were important factors directly inducing word co-occurrence. Combinations of some of those semantic relations were an important factor indirectly driving word co-occurrence. In addition, syntactic analysis and lexical semantic theories were combined to analyze the direct and indirect semantic relations. This analysis showed that the factors driving word co-occurrence in sentences could be classified into three relation categories: collocation and modification, hyponymy, and synonym and antonym. These findings can help explain the phenomenon of word co-occurrence and improve the method and application of co-word analysis.
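
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the function names, and the small TOY_RELATIONS table are hypothetical stand-ins, and the table replaces a real ConceptNet lookup. The sketch counts word pairs that share a sentence at any distance, then annotates a pair with a direct relation or, failing that, with two relations joined through an intermediate word.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurring_pairs(sentences):
    """Count unordered word pairs that share a sentence, at any distance."""
    counts = Counter()
    for sentence in sentences:
        # Naive tokenizer (an assumption): lowercase alphabetic runs only.
        words = sorted(set(re.findall(r"[a-z]+", sentence.lower())))
        counts.update(combinations(words, 2))
    return counts


# Hypothetical toy knowledge base standing in for ConceptNet.
TOY_RELATIONS = {
    ("doctor", "hospital"): "RelatedTo",
    ("hospital", "nurse"): "RelatedTo",
    ("cat", "animal"): "IsA",
}


def direct_relation(a, b, relations=TOY_RELATIONS):
    """Look up a relation between a and b; direction is ignored here."""
    return relations.get((a, b)) or relations.get((b, a))


def annotate(a, b, relations=TOY_RELATIONS):
    """Return a direct relation, or a two-step path via an intermediate word."""
    direct = direct_relation(a, b, relations)
    if direct:
        return [(a, direct, b)]
    vocabulary = {word for pair in relations for word in pair}
    for mid in sorted(vocabulary - {a, b}):
        first = direct_relation(a, mid, relations)
        second = direct_relation(mid, b, relations)
        if first and second:
            return [(a, first, mid), (mid, second, b)]
    return []  # no direct or one-step indirect relation found
```

Calling `annotate("doctor", "nurse")` on this toy table finds no direct relation and falls back to the path through "hospital", which mirrors the indirect case in the abstract. A real study would also need lemmatization, stop-word handling, and a frequency threshold for "frequently co-occurring" pairs, none of which is specified here.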

Suggested Citation

  • Yiming Zhao & Jiaying Yin & Jin Zhang & Linrong Wu, 2023. "Identifying the driving factors of word co-occurrence: a perspective of semantic relations," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(12), pages 6471-6494, December.
  • Handle: RePEc:spr:scient:v:128:y:2023:i:12:d:10.1007_s11192-023-04851-x
    DOI: 10.1007/s11192-023-04851-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-023-04851-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11192-023-04851-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
