
The Effect of Stemming on Arabic Text Classification: An Empirical Study

Authors

  • Abdullah Wahbeh (Dakota State University, USA)
  • Mohammed Al-Kabi (Yarmouk University, Jordan)
  • Qasem Al-Radaideh (Yarmouk University, Jordan)
  • Emad Al-Shawakfa (Yarmouk University, Jordan)
  • Izzat Alsmadi (Yarmouk University, Jordan)

Abstract

The information world is rich in documents in different formats and applications, such as databases, digital libraries, and the Web. Text classification supports the search functionality offered by search engines and information retrieval systems, helping them deal with the large number of documents on the Web. Many studies in the field of text classification have been applied to English, Dutch, Chinese, and other languages, whereas fewer have been applied to Arabic. This paper addresses the automatic classification of Arabic text documents. It applies text classification to Arabic-language text documents using stemming as one of the preprocessing steps. The results show that, without stemming, the support vector machine (SVM) classifier achieved the highest classification accuracy under the two test modes, with 87.79% and 88.54%. Stemming, on the other hand, negatively affected accuracy: the SVM accuracy under the two test modes dropped to 84.49% and 86.35%.
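To make the reported effect of stemming concrete, the short Python sketch below computes the drop in SVM accuracy for each test mode from the four figures quoted in the abstract. The abstract does not name the two test modes, so the labels "test mode 1" and "test mode 2", along with the dictionary layout and the helper `accuracy_drop`, are assumptions made for illustration only and are not taken from the paper.

```python
# SVM accuracies (percent) as reported in the abstract.
# The mode labels are placeholders: the abstract does not name the two test modes.
svm_accuracy = {
    "test mode 1": {"without_stemming": 87.79, "with_stemming": 84.49},
    "test mode 2": {"without_stemming": 88.54, "with_stemming": 86.35},
}


def accuracy_drop(scores):
    """Return the accuracy lost to stemming, in percentage points."""
    return round(scores["without_stemming"] - scores["with_stemming"], 2)


# Hypothetical helper output: one drop value per test mode.
drops = {mode: accuracy_drop(scores) for mode, scores in svm_accuracy.items()}

for mode, drop in drops.items():
    print(f"{mode}: stemming lowered SVM accuracy by {drop:.2f} percentage points")
```

Under these figures, the drop is 3.30 percentage points in the first mode and 2.19 in the second, so the penalty from stemming differs between the two modes but points in the same direction in both.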

Suggested Citation

  • Abdullah Wahbeh & Mohammed Al-Kabi & Qasem Al-Radaideh & Emad Al-Shawakfa & Izzat Alsmadi, 2011. "The Effect of Stemming on Arabic Text Classification: An Empirical Study," International Journal of Information Retrieval Research (IJIRR), IGI Global, vol. 1(3), pages 54-70, July.
  • Handle: RePEc:igg:jirr00:v:1:y:2011:i:3:p:54-70

    Download full text from publisher

    File URL: http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/ijirr.2011070104
    Download Restriction: no
