Topic classification of Vietnamese product reviews in e-commerce using PhoBERT

Author

Listed:
  • Tuan Duy Nguyen

    (Department of Mathematical Economics, National Economics University)

  • Duc Minh Nguyen

    (Department of Mathematical Economics, National Economics University)

  • Huu Manh Nguyen

    (Deakin University)

  • Thi Quynh Giang Nguyen

    (Department of Mathematical Economics, National Economics University)

Abstract

Integrating customer feedback into product refinement is critical for businesses that want to improve their products, enhance customer experience, and increase revenue. User-generated reviews on e-commerce platforms provide timely and unbiased feedback, but their unstructured nature and high volume make it difficult to extract actionable insights. This study investigates the application of natural language processing (NLP) techniques, specifically topic modeling and text classification, to this problem. In the first phase, we employed Latent Dirichlet Allocation (LDA) and BERTopic to identify latent topics in customer reviews, providing a thematic overview of customer discussions. Because these models performed poorly, the second phase introduced manual labeling based on the topics identified in the first phase. The final classification model combined PhoBERT embeddings with Logistic Regression. Experiments were conducted on a dataset of 17,002 Vietnamese customer reviews from Shopee, one of Vietnam’s largest e-commerce platforms. The model categorized reviews into five primary topics (Product Quality, Customer Service, Price, Shipping, and Packaging), achieving an F1-score of 0.96 and a Hamming Loss of 0.022. These results can help e-commerce managers identify customer issues and improve the buying experience.
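
To make the two reported metrics concrete, the following is a minimal sketch of how Hamming Loss and an F1-score can be computed for topic labels. It assumes a multi-label setup (a review may carry several of the five topics at once), which the use of Hamming Loss suggests but the abstract does not state outright. It also assumes micro-averaged F1, since the abstract does not say which averaging was used. The toy labels and the function names `hamming_loss` and `micro_f1` are illustrative only and are not taken from the paper.

```python
from typing import List

# The five topics named in the abstract, used here as column order.
TOPICS = ["Product Quality", "Customer Service", "Price", "Shipping", "Packaging"]

Labels = List[List[int]]  # one row per review, one 0/1 entry per topic


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of (review, topic) slots where prediction and truth disagree."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all topics, then compute F1 once.

    Assumption: the abstract does not specify the averaging scheme.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 4 reviews, 5 topics each (not from the paper's dataset).
Y_TRUE = [
    [1, 0, 0, 1, 0],  # quality + shipping
    [0, 1, 0, 0, 0],  # customer service
    [1, 0, 1, 0, 0],  # quality + price
    [0, 0, 0, 1, 1],  # shipping + packaging
]
Y_PRED = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],  # the Price label is missed here
    [0, 0, 0, 1, 1],
]

if __name__ == "__main__":
    print("Hamming loss:", hamming_loss(Y_TRUE, Y_PRED))
    print("Micro F1:", round(micro_f1(Y_TRUE, Y_PRED), 3))
```

In this toy case one of the 20 label slots is wrong, so the Hamming Loss is 1/20. Read the same way, the reported Hamming Loss of 0.022 would mean that roughly 2.2% of review-topic assignments were incorrect, under the multi-label assumption above.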

Suggested Citation

  • Tuan Duy Nguyen & Duc Minh Nguyen & Huu Manh Nguyen & Thi Quynh Giang Nguyen, 2025. "Topic classification of vietnamese product reviews in e-commerce using PhoBERT," Journal of Marketing Analytics, Palgrave Macmillan, vol. 13(2), pages 371-385, June.
  • Handle: RePEc:pal:jmarka:v:13:y:2025:i:2:d:10.1057_s41270-025-00402-w
    DOI: 10.1057/s41270-025-00402-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1057/s41270-025-00402-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

