
Sentiment Analysis of Multilingual Dataset of Bahraini Dialects, Arabic, and English

Authors
  • Thuraya Omran

    (Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK)

  • Baraa Sharef

    (Department of Information Technology, College of Information Technology, Ahlia University, Manama P.O. Box 10878, Bahrain)

  • Crina Grosan

    (Division of Applied Technologies for Clinical Care, King’s College London, London WC2R 2LS, UK)

  • Yongmin Li

    (Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, UK)

Abstract

Sentiment analysis is an application of natural language processing (NLP) that requires a machine learning algorithm and a dataset. Datasets are scarce for some languages, particularly Arabic dialects and especially the Bahraini ones, which necessitates an approach such as translation, where a resource-rich source language is exploited to create a dataset in the target language. In this study, a dataset of Amazon product reviews in Bahraini dialects is presented. The dataset was generated in two cascaded translation stages: machine translation followed by manual translation. In the first stage, Google Translate was used to translate English Amazon product reviews into Standard Arabic. In the second stage, qualified native speakers translated the resulting Arabic reviews into Bahraini dialects using custom-built forms. The resulting parallel dataset of English, Standard Arabic, and Bahraini dialects is called English_Modern Standard Arabic_Bahraini Dialects product reviews for sentiment analysis, “E_MSA_BDs-PR-SA”. The dataset is balanced, comprising 2500 positive and 2500 negative reviews. Sentiment analysis was performed with a stacked LSTM deep learning model. The Bahraini dialect product dataset can be used for transfer learning when analyzing the sentiment of other datasets in Bahraini dialects.
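
As a rough illustration of the parallel, balanced structure the abstract describes, the sketch below models one aligned row (English, Standard Arabic, Bahraini dialect, sentiment label) and checks class balance. The class name, field names, label encoding (1 = positive, 0 = negative), and placeholder strings are all assumptions made for illustration; they are not the published schema of E_MSA_BDs-PR-SA.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelReview:
    """One aligned row; field names are hypothetical, not the authors' schema."""
    english: str    # original English review
    msa: str        # Standard Arabic text from the machine translation stage
    bahraini: str   # Bahraini dialect text from the manual translation stage
    label: int      # assumed encoding: 1 = positive, 0 = negative


def is_balanced(reviews):
    """Return True when positive and negative reviews occur equally often."""
    counts = Counter(r.label for r in reviews)
    return counts[0] == counts[1]


# Placeholder strings stand in for real review text.
sample = [
    ParallelReview("en-1", "msa-1", "bd-1", 1),
    ParallelReview("en-2", "msa-2", "bd-2", 0),
]
```

Applied to the full dataset, a check of this kind would be expected to pass, given the 2500 positive and 2500 negative reviews the abstract reports.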

Suggested Citation

  • Thuraya Omran & Baraa Sharef & Crina Grosan & Yongmin Li, 2023. "Sentiment Analysis of Multilingual Dataset of Bahraini Dialects, Arabic, and English," Data, MDPI, vol. 8(4), pages 1-13, March.
  • Handle: RePEc:gam:jdataj:v:8:y:2023:i:4:p:68-:d:1111305

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/8/4/68/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/8/4/68/
    Download Restriction: no
