
ERF-XGB: Ensemble Random Forest-Based XG Boost for Accurate Prediction and Classification of E-Commerce Product Review

Author

Listed:
  • Daniyal M. Alghazzawi

    (Department of Information Systems, College of Computer Sciences and Information Technology, King Abdulaziz University, Jeddah 80200, Saudi Arabia)

  • Anser Ghazal Ali Alquraishee

    (Department of Information Systems, College of Computer Sciences and Information Technology, King Abdulaziz University, Jeddah 80200, Saudi Arabia)

  • Sahar K. Badri

    (Department of Information Systems, College of Computer Sciences and Information Technology, King Abdulaziz University, Jeddah 80200, Saudi Arabia)

  • Syed Hamid Hasan

    (Department of Information Systems, College of Computer Sciences and Information Technology, King Abdulaziz University, Jeddah 80200, Saudi Arabia)

Abstract

Recently, e-commerce product review evaluation has become a research topic of significant interest in sentiment analysis. Estimating the sentiment polarity of product reviews is an effective way to obtain buyers' opinions on products, and it helps online shoppers evaluate the service and product quality of what they have purchased. However, issues related to polysemy, disambiguation, and word dimension mapping create prediction problems when analyzing online reviews. To address these issues and improve sentiment polarity classification, this paper proposes a new sentiment analysis model, the Ensemble Random Forest-based XG Boost (ERF-XGB) approach, for the accurate binary classification of online e-commerce product review sentiments. Two different Internet Movie Database (IMDB) datasets and the Chinese Emotional Corpus (ChnSentiCorp) dataset are used for estimating online reviews. First, the datasets are preprocessed through tokenization, lemmatization, and stemming. The Harris hawk optimization (HHO) algorithm then selects the corresponding features for the two datasets. Finally, the sentiments in the online reviews are classified into positive and negative categories using the proposed ERF-XGB approach. Hyperparameter tuning is used to find the parameter values that improve the performance of the proposed ERF-XGB algorithm. Performance is analyzed using four evaluation indicators (accuracy, recall, precision, and F1-score) and compared against different existing approaches. Compared with existing methods, the proposed ERF-XGB approach effectively predicts the sentiments of online product reviews, with an accuracy of about 98.7% on the ChnSentiCorp dataset and 98.2% on the IMDB dataset.
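
The four evaluation indicators named in the abstract can be illustrated with a minimal sketch for the binary (positive/negative) case. This is not the authors' code: the function name `binary_metrics`, the 1/0 label encoding, and the eight example labels are assumptions made purely for illustration, and the numbers are not results from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for labels 1 (positive) and 0 (negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for eight reviews (illustrative only, not data from the paper).
example_true = [1, 1, 1, 1, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 0, 0, 0, 1]
example_scores = binary_metrics(example_true, example_pred)
```

In this sketch, precision is TP / (TP + FP), recall is TP / (TP + FN), and F1 is the harmonic mean of the two; accuracy alone can be misleading when the positive and negative classes are unbalanced, which is one reason all four indicators are usually reported together.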

Suggested Citation

  • Daniyal M. Alghazzawi & Anser Ghazal Ali Alquraishee & Sahar K. Badri & Syed Hamid Hasan, 2023. "ERF-XGB: Ensemble Random Forest-Based XG Boost for Accurate Prediction and Classification of E-Commerce Product Review," Sustainability, MDPI, vol. 15(9), pages 1-14, April.
  • Handle: RePEc:gam:jsusta:v:15:y:2023:i:9:p:7076-:d:1130832

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/9/7076/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/9/7076/
    Download Restriction: no