
HeBERT and HebEMO: A Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition

Author

Listed:
  • Avihay Chriqui

    (Coller School of Management, Tel Aviv University, Tel Aviv 6997801, Israel)

  • Inbal Yahav

    (Coller School of Management, Tel Aviv University, Tel Aviv 6997801, Israel)

Abstract

Sentiment analysis of user-generated content (UGC) can provide valuable information across numerous domains, including marketing, psychology, and public health. Currently, there are very few Hebrew models for natural language processing in general, and for sentiment analysis in particular; indeed, it is not straightforward to develop such models because Hebrew is a morphologically rich language (MRL) with challenging characteristics. Moreover, the only available Hebrew sentiment analysis model, based on a recurrent neural network, was developed for polarity analysis (classifying text as positive, negative, or neutral) and was not used for detection of finer-grained emotions (e.g., anger, fear, or joy). To address these gaps, this paper introduces HeBERT and HebEMO. HeBERT is a transformer-based model for modern Hebrew text, which relies on a BERT (bidirectional encoder representations from transformers) architecture. BERT has been shown to outperform alternative architectures in sentiment analysis and is suggested to be particularly appropriate for MRLs. Analyzing multiple BERT specifications, we find that whereas model complexity correlates with high performance on language tasks that aim to understand terms in a sentence, a more parsimonious model better captures the sentiment of an entire sentence. Notably, regardless of the complexity of the BERT specification, our BERT-based language model outperforms all existing Hebrew alternatives on all language tasks examined. HebEMO is a tool that uses HeBERT to detect polarity and extract emotions from Hebrew UGC. HebEMO is trained on a unique COVID-19-related UGC data set that we collected and annotated for this study. Data collection and annotation followed an active learning procedure that aimed to maximize predictability. We show that HebEMO achieves high accuracy for polarity classification. Emotion detection reaches high performance for various target emotions, with the exception of surprise, which the model failed to capture. These results are better than the best reported performance, even among English-language models of emotion detection.

Suggested Citation

  • Avihay Chriqui & Inbal Yahav, 2022. "HeBERT and HebEMO: A Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition," INFORMS Journal on Data Science, INFORMS, vol. 1(1), pages 81-95, April.
  • Handle: RePEc:inm:orijds:v:1:y:2022:i:1:p:81-95
    DOI: 10.1287/ijds.2022.0016

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/ijds.2022.0016
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kunpeng Zhang & Wendy Moe, 2021. "Measuring Brand Favorability Using Large-Scale Social Media Data," Information Systems Research, INFORMS, vol. 32(4), pages 1128-1139, December.
    2. Uttara Ananthakrishnan & Davide Proserpio & Siddhartha Sharma, 2023. "I Hear You: Does Quality Improve with Customer Voice?," Marketing Science, INFORMS, vol. 42(6), pages 1143-1161, November.
    3. Tamilla Triantoro & Ram Gopal & Raquel Benbunan-Fich & Guido Lang, 0. "Personality and games: enhancing online surveys through gamification," Information Technology and Management, Springer, vol. 0, pages 1-10.
    4. Kai Yang & Raymond Y. K. Lau & Ahmed Abbasi, 2023. "Getting Personal: A Deep Learning Artifact for Text-Based Measurement of Personality," Information Systems Research, INFORMS, vol. 34(1), pages 194-222, March.
    5. Tamilla Triantoro & Ram Gopal & Raquel Benbunan-Fich & Guido Lang, 2020. "Personality and games: enhancing online surveys through gamification," Information Technology and Management, Springer, vol. 21(3), pages 169-178, September.
    6. Bongsug (Kevin) Chae & Gyuhyeong Goh, 2020. "Digital Entrepreneurs in Artificial Intelligence and Data Analytics: Who Are They?," JOItmC, MDPI, vol. 6(3), pages 1-15, July.
    7. Hyelim Oh & Khim-Yong Goh & Tuan Q. Phan, 2023. "Are You What You Tweet? The Impact of Sentiment on Digital News Consumption and Social Media Sharing," Information Systems Research, INFORMS, vol. 34(1), pages 111-136, March.
    8. Grewal, Dhruv & Herhausen, Dennis & Ludwig, Stephan & Villarroel Ordenes, Francisco, 2022. "The Future of Digital Communication Research: Considering Dynamics and Multimodality," Journal of Retailing, Elsevier, vol. 98(2), pages 224-240.
    9. Argyris, Young Anna & Muqaddam, Aziz & Miller, Steven, 2021. "The effects of the visual presentation of an Influencer's Extroversion on perceived credibility and purchase intentions—moderated by personality matching with the audience," Journal of Retailing and Consumer Services, Elsevier, vol. 59(C).
    10. Siqing Shan & Qi Yan & Yigang Wei, 2020. "Infectious or Recovered? Optimizing the Infectious Disease Detection Process for Epidemic Control and Prevention Based on Social Media," IJERPH, MDPI, vol. 17(18), pages 1-25, September.
    11. Chenshuo Sun & Panagiotis Adamopoulos & Anindya Ghose & Xueming Luo, 2022. "Predicting Stages in Omnichannel Path to Purchase: A Deep Learning Model," Information Systems Research, INFORMS, vol. 33(2), pages 429-445, June.
    12. Kraus, Mathias & Feuerriegel, Stefan & Oztekin, Asil, 2020. "Deep learning in business analytics and operations research: Models, applications and managerial implications," European Journal of Operational Research, Elsevier, vol. 281(3), pages 628-641.
    13. Xiaoxi Zhu & Changhui Yang & Kai Liu & Rui Zhang & Qingquan Jiang, 2022. "Cooperation and decision making in a two-sided market motivated by the externality of a third-party social media platform," Annals of Operations Research, Springer, vol. 316(1), pages 117-142, September.
    14. Vilma Todri, 2022. "Frontiers: The Impact of Ad-Blockers on Online Consumer Behavior," Marketing Science, INFORMS, vol. 41(1), pages 7-18, January.
    15. Konstantin Bauman & Alexander Tuzhilin, 2022. "Know Thy Context: Parsing Contextual Information from User Reviews for Recommendation Purposes," Information Systems Research, INFORMS, vol. 33(1), pages 179-202, March.
    16. Arslan Aziz & Hui Li & Rahul Telang, 2023. "The Consequences of Rating Inflation on Platforms: Evidence from a Quasi-Experiment," Information Systems Research, INFORMS, vol. 34(2), pages 590-608, June.
