Printed from https://ideas.repec.org/a/gam/jijerp/v17y2020i8p2687-d345294.html

Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules

Author

Listed:
  • Xianglong Chen

    (School of Computer, University of South China, Hengyang 421001, China)

  • Chunping Ouyang

    (School of Computer, University of South China, Hengyang 421001, China)

  • Yongbin Liu

    (School of Computer, University of South China, Hengyang 421001, China)

  • Yi Bu

    (Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA)

Abstract

Electronic medical records are an integral part of medical texts, and entity recognition in these records has prompted many studies proposing a range of entity extraction methods. In this paper, we propose an entity extraction model for Chinese Electronic Medical Records (CEMR). In the input layer, the model uses word embeddings and dictionary feature embeddings as input vectors, where each word embedding consists of a character representation and a word representation. The input vectors are then fed to a bidirectional long short-term memory (Bi-LSTM) network to capture contextual features. Finally, a conditional random field (CRF) captures dependencies between neighboring tags. On a body classification task, the F1 score reached 90.65%; on an anatomic region recognition task, it reached 93.89%. On both tasks, our model outperformed state-of-the-art models such as Bi-LSTM-CRF, Bi-LSTM-Attention, and Vote. The experiments also show that the model handles low-frequency and unknown entities well: with a small training dataset, it improved the F1 score by 2–4% over the basic Bi-LSTM-CRF models. Additionally, on the anatomic region recognition task, we combined the proposed model with a domain dictionary and 12 rules that we designed; with these additions, the weighted F1 score for extracting three specific entity types reached 84.36%.
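The two ends of the pipeline described in the abstract, dictionary features at the input and tag-dependency decoding at the output, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the abstract does not say how dictionary features are computed, so forward maximum matching with B/I/O marks is used as one plausible choice; the lexicon, the scores, and the function names `dictionary_features` and `viterbi` are hypothetical; and the Bi-LSTM is omitted, with hand-written emission scores standing in for its output.

```python
from typing import Dict, List, Sequence, Set, Tuple


def dictionary_features(chars: Sequence[str], lexicon: Set[str],
                        max_len: int = 4) -> List[str]:
    """Mark each character B/I/O by forward maximum matching (assumed scheme)."""
    tags = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        matched = 0
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(chars) - i), 0, -1):
            if "".join(chars[i:i + n]) in lexicon:
                matched = n
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float],
            labels: List[str]) -> List[str]:
    """Return the highest-scoring label path for a linear-chain model.

    The path score is the sum of per-position emission scores and
    label-to-label transition scores (missing transitions count as 0).
    """
    score = {y: emissions[0][y] for y in labels}
    back: List[Dict[str, str]] = []
    for em in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            prev = max(labels,
                       key=lambda p: score[p] + transitions.get((p, y), 0.0))
            new_score[y] = score[prev] + transitions.get((prev, y), 0.0) + em[y]
            pointers[y] = prev
        score = new_score
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: score[y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical lexicon and sentence, for illustration only.
lexicon = {"右肺", "右肺下叶", "结节"}
dict_tags = dictionary_features(list("患者右肺下叶结节"), lexicon)

# Hypothetical scores: per-position argmax would give O, I,
# but the O -> I penalty makes the decoder prefer B, I.
labels = ["O", "B", "I"]
emissions = [{"O": 1.0, "B": 0.8, "I": 0.0},
             {"O": 0.0, "B": 0.0, "I": 2.0}]
transitions = {("O", "I"): -5.0}
best_path = viterbi(emissions, transitions, labels)
```

In a full model, the dictionary tags would be embedded and concatenated with the character and word representations before the Bi-LSTM, and the transition scores would be learned rather than set by hand. The second example shows why a decoding layer that scores neighboring tags can help: it rules out sequences such as an `I` tag directly after `O`, which independent per-position predictions can produce.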

Suggested Citation

  • Xianglong Chen & Chunping Ouyang & Yongbin Liu & Yi Bu, 2020. "Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules," IJERPH, MDPI, vol. 17(8), pages 1-16, April.
  • Handle: RePEc:gam:jijerp:v:17:y:2020:i:8:p:2687-:d:345294

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/17/8/2687/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/17/8/2687/
    Download Restriction: no

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Senqi Yang & Xuliang Duan & Zeyan Xiao & Zhiyao Li & Yuhai Liu & Zhihao Jie & Dezhao Tang & Hui Du, 2022. "Sentiment Classification of Chinese Tourism Reviews Based on ERNIE-Gram+GCN," IJERPH, MDPI, vol. 19(20), pages 1-20, October.
