
Method of Lexical Enrichment in Information Retrieval System in Arabic

Author

Listed:
  • Souheyl Mallat

    (Department of Computer Sciences, University of Monastir, Monastir, Tunisia)

  • Anis Zouaghi

    (Department of Computer Sciences, Higher Institute of Applied Science and Technologies Sousse, Sousse University, Sousse, Tunisia)

  • Emna Hkiri

    (Department of Computer Sciences, University of Monastir, Monastir, Tunisia)

  • Mounir Zrigui

    (Department of Computer Sciences, University of Monastir, Monastir, Tunisia)

Abstract

In this paper, the authors propose a method for the lexical enrichment of Arabic queries in order to improve the performance of information retrieval systems (SRI). The method combines two types of enrichment: linguistic and contextual. The first is based on linguistic analysis (lemmatization, and morphological, syntactic and semantic analysis), whose goal is to generate a descriptive list (list-desc). This list contains a set of lexical items assigned to each significant term in the query. The second enrichment integrates contextual information derived from the corpus documents. It is based on statistical analysis using Salton's weighting functions, TF-IDF and TF-IEF. The TF-IDF function is applied to the list-desc and the documents in the corpus in order to identify relevant documents. The TF-IEF function is then applied to the list-desc and the sentences belonging to the relevant documents in order to identify relevant sentences. Finally, the terms in these sentences are weighted, and those with the highest weights, considered the richest in informative and contextual importance, are added to the original query. The authors' lexical enrichment method was evaluated on a corpus of documents belonging to a specialized domain, and the results show its value in terms of precision and recall.
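
The contextual enrichment step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define TF-IEF, so the sketch assumes it is the same weighting as TF-IDF computed over sentences instead of documents. The function names (`tf_idf_weights`, `rank_units`, `enrich_query`), the parameters, and the toy English tokens are all hypothetical, and the linguistic analysis that produces the list-desc is assumed to have been done already.

```python
import math
from collections import Counter


def tf_idf_weights(units):
    """Return one {term: weight} dict per unit (a unit is a list of tokens).

    Weight = (term count / unit length) * log(N / number of units with the term).
    """
    n = len(units)
    df = Counter(term for unit in units for term in set(unit))
    weights = []
    for unit in units:
        tf = Counter(unit)
        weights.append(
            {t: (c / len(unit)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def rank_units(descriptors, units, top_k):
    """Indices of the top_k units with the highest positive descriptor score."""
    weights = tf_idf_weights(units)
    scores = [sum(w.get(t, 0.0) for t in descriptors) for w in weights]
    order = sorted(range(len(units)), key=lambda i: scores[i], reverse=True)
    return [i for i in order[:top_k] if scores[i] > 0]


def enrich_query(query, descriptors, documents,
                 top_docs=1, top_sents=1, n_terms=2):
    """documents: list of documents, each a list of sentences (token lists)."""
    # Step 1: document-level weighting selects the relevant documents.
    doc_tokens = [[t for sent in doc for t in sent] for doc in documents]
    relevant_docs = rank_units(descriptors, doc_tokens, top_docs)

    # Step 2 (assumed form of TF-IEF): the same weighting over the
    # sentences of the relevant documents selects the relevant sentences.
    sentences = [sent for i in relevant_docs for sent in documents[i]]
    relevant_sents = rank_units(descriptors, sentences, top_sents)

    # Step 3: keep the highest-weighted terms that are not already in the query.
    sent_weights = tf_idf_weights(sentences)
    candidates = {}
    for i in relevant_sents:
        for term, w in sent_weights[i].items():
            if term not in query:
                candidates[term] = max(candidates.get(term, 0.0), w)
    best = sorted(candidates, key=lambda t: (-candidates[t], t))[:n_terms]
    return list(query) + best


# Toy corpus, invented for illustration only.
docs = [
    [["bank", "loan", "interest"], ["bank", "credit", "mortgage"]],
    [["river", "water"], ["fish", "water"]],
]
enriched = enrich_query(["bank"], ["bank", "credit"], docs)
```

In this sketch the descriptor list drives both selection steps, while the added terms come only from the selected sentences, which mirrors the order of operations given in the abstract. The cut-offs `top_docs`, `top_sents` and `n_terms` are arbitrary illustrative choices; the abstract does not state how many documents, sentences or terms the authors retain.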

Suggested Citation

  • Souheyl Mallat & Anis Zouaghi & Emna Hkiri & Mounir Zrigui, 2013. "Method of Lexical Enrichment in Information Retrieval System in Arabic," International Journal of Information Retrieval Research (IJIRR), IGI Global, vol. 3(4), pages 35-51, October.
  • Handle: RePEc:igg:jirr00:v:3:y:2013:i:4:p:35-51

    Download full text from publisher

    File URL: http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/ijirr.2013100103
    Download Restriction: no
