Deceptive opinion spam detection using feature reduction techniques

Author

Listed:
  • Sushil Kumar Maurya

    (Motilal Nehru National Institute of Technology Allahabad)

  • Dinesh Singh

    (Motilal Nehru National Institute of Technology Allahabad)

  • Ashish Kumar Maurya

    (Motilal Nehru National Institute of Technology Allahabad)

Abstract

Shoppers often read online reviews before purchasing a product, and sellers sometimes post deceptive reviews that imitate genuine user experience in order to increase profits. Deceptive opinion spam detection has emerged as a challenging task in the field of opinion mining. Feature reduction plays a central role in data mining: it identifies the essential features and removes unnecessary dimensions that contribute only noise. This article extracts various textual features from gold-standard deceptive hotel reviews using representation techniques such as Part-of-Speech (POS) tags, Bag of Words (BoW), and Doc2Vec. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are applied to reduce the dimensionality of the features. Supervised classifiers such as Decision Tree (DT), Naïve Bayes (NB), Logistic Regression (LR), and Support Vector Machine (SVM) are used to classify opinions as deceptive or truthful. The features used by these supervised classifiers cannot retain sequential information from reviews. To overcome this problem, we use a Words Attention-based Bidirectional Long Short-Term Memory (WABiLSTM) network, which is trained to learn word patterns. The article examines machine learning and deep learning-based spam detection models and provides their outline and results. Accuracy, precision, recall, and F-measure are used to analyze the performance of these classification models. The experimental results show that model performance improved after feature reduction.
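As a rough illustration of two ingredients the abstract names, the sketch below builds a Bag of Words count matrix and computes accuracy, precision, recall, and F-measure for a binary deceptive/truthful labelling. It is not the authors' implementation: the function names, the whitespace tokenization, the toy reviews, and the convention that label 1 means "deceptive" are all assumptions made for this example, and the feature reduction and classifier stages are not shown.

```python
from collections import Counter


def bag_of_words(reviews):
    """Return (vocabulary, count vectors) for a list of review strings.

    Assumption: lowercase whitespace tokenization; the article's actual
    preprocessing is not described in the abstract.
    """
    tokenized = [review.lower().split() for review in reviews]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One column per vocabulary word, holding that word's count.
        vectors.append([counts.get(word, 0) for word in vocabulary])
    return vocabulary, vectors


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for binary labels.

    Assumption: `positive` (default 1) marks the deceptive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    vocab, X = bag_of_words(["great hotel great staff", "bad hotel"])
    print(vocab)
    print(X)
    print(classification_metrics([1, 1, 1, 0], [1, 1, 0, 0]))
```

Because each count vector records only how often a word occurs, the word order of the review is lost, which is the limitation the abstract cites as motivation for the sequence-based WABiLSTM model.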

Suggested Citation

  • Sushil Kumar Maurya & Dinesh Singh & Ashish Kumar Maurya, 2024. "Deceptive opinion spam detection using feature reduction techniques," International Journal of System Assurance Engineering and Management, Springer; The Society for Reliability, Engineering Quality and Operations Management (SREQOM), India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 15(3), pages 1210-1230, March.
  • Handle: RePEc:spr:ijsaem:v:15:y:2024:i:3:d:10.1007_s13198-023-02208-4
    DOI: 10.1007/s13198-023-02208-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13198-023-02208-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s13198-023-02208-4?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Muhammad Adnan Aslam & Fiza Murtaza & Muhammad Ehatisham Ul Haq & Amanullah Yasin & Numan Ali, 2025. "SAPEx-D: A Comprehensive Dataset for Predictive Analytics in Personalized Education Using Machine Learning," Data, MDPI, vol. 10(3), pages 1-29, February.
    2. Yuetong Zhao & Deqin Lin, 2023. "Prediction of Micro- and Small-Sized Enterprise Default Risk Based on a Logistic Model: Evidence from a Bank of China," Sustainability, MDPI, vol. 15(5), pages 1-13, February.
