
Coarse-to-Fine Entity Alignment for Chinese Heterogeneous Encyclopedia Knowledge Base

Authors
  • Meng Wu

    (Ministry of Education Key Laboratory of Knowledge Engineering with Big Data, School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China)

  • Tingting Jiang

    (Ministry of Education Key Laboratory of Knowledge Engineering with Big Data, School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China)

  • Chenyang Bu

    (Ministry of Education Key Laboratory of Knowledge Engineering with Big Data, School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China)

  • Bin Zhu

    (Anhui Province Key Laboratory of Infrared and Low Temperature Plasma, College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China)

Abstract

Entity alignment (EA) aims to automatically determine whether a pair of entities from different knowledge bases or knowledge graphs refers to the same real-world entity. Inspired by human cognitive mechanisms, we propose a coarse-to-fine entity alignment model (called CFEA) consisting of three stages: coarse-grained, middle-grained, and fine-grained. In the coarse-grained stage, a pruning strategy based on entity-type restrictions reduces the number of candidate matching entities by filtering out pairs that are clearly not the same entity. In the middle-grained stage, we calculate the similarity of entity pairs from key attribute values and matched attribute values in order to identify the pairs that are obviously the same entity or obviously not the same entity, which further reduces the number of candidate entity pairs. In the fine-grained stage, contextual information, such as abstract and description text, is considered, and topic modeling is carried out to achieve more accurate matching. The basic idea of this stage is to bring in additional information to judge entity pairs that are difficult to distinguish using only the basic information from the first two stages. Experimental results on real-world datasets verify the effectiveness of our model compared with baselines.
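
The three stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' CFEA implementation: the `Entity` class, the function names, and the thresholds `low`, `high`, and `text_threshold` are all hypothetical, and a simple word-count cosine similarity stands in for the topic modeling that the paper uses in the fine-grained stage. The sketch also assumes whitespace-tokenized text; Chinese text would first need a word segmenter.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt
from typing import Optional


@dataclass
class Entity:
    # Hypothetical record layout, chosen for illustration only.
    name: str
    etype: str                                  # entity type, used for coarse pruning
    attrs: dict = field(default_factory=dict)   # attribute name -> value
    text: str = ""                              # abstract / description text


def attribute_similarity(a: Entity, b: Entity) -> Optional[float]:
    """Fraction of shared attribute names whose values are equal.

    Returns None when the two entities share no attribute names, so the
    caller can defer the decision to the fine-grained stage.
    """
    shared = set(a.attrs) & set(b.attrs)
    if not shared:
        return None
    agree = sum(1 for key in shared if a.attrs[key] == b.attrs[key])
    return agree / len(shared)


def text_similarity(a: Entity, b: Entity) -> float:
    """Cosine similarity of word-count vectors (a stand-in for topic modeling)."""
    va = Counter(a.text.lower().split())
    vb = Counter(b.text.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def align(a: Entity, b: Entity, low: float = 0.3, high: float = 0.8,
          text_threshold: float = 0.5) -> bool:
    """Decide whether a and b refer to the same entity (thresholds are assumed)."""
    # Coarse-grained: prune pairs whose entity types differ.
    if a.etype != b.etype:
        return False
    # Middle-grained: settle the obvious cases from attribute values.
    score = attribute_similarity(a, b)
    if score is not None:
        if score >= high:
            return True
        if score <= low:
            return False
    # Fine-grained: use descriptive text for the remaining hard cases.
    return text_similarity(a, b) >= text_threshold
```

Ordering the checks from cheapest to most expensive means that the text comparison only runs on pairs that the type and attribute checks could not settle, which mirrors the motivation the abstract gives for the coarse-to-fine design.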

Suggested Citation

  • Meng Wu & Tingting Jiang & Chenyang Bu & Bin Zhu, 2022. "Coarse-to-Fine Entity Alignment for Chinese Heterogeneous Encyclopedia Knowledge Base," Future Internet, MDPI, vol. 14(2), pages 1-19, January.
  • Handle: RePEc:gam:jftint:v:14:y:2022:i:2:p:39-:d:734021

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/14/2/39/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/14/2/39/
    Download Restriction: no