Authors
- Kin Wai Ng
- Nathan Wendt
- Jasmine Eshun
- Emily Saldanha
Abstract
Viewpoint detection is a crucial step for characterizing online narratives and understanding their diffusion and evolution across online social spaces. Existing methods for semantic understanding of online posts have shown strong progress towards grouping documents with similar topics but struggle to distinguish between viewpoints towards those topics. The purpose of this work is to infuse semantic embedding spaces with improved viewpoint information under the constraint of low data availability. To address this task, we develop a novel weakly supervised contrastive learning approach that leverages social proximity of users as a self-supervised signal of shared viewpoint likelihood. We demonstrate the utility of this method on a use case of X (formerly Twitter) discussion related to COVID-19. We show that embeddings fine-tuned to predict the social proximity signals present in retweet networks become infused with viewpoint information. Finally, we demonstrate that these viewpoint-infused embeddings show improved effectiveness at identifying clusters of tweets with shared viewpoints and topics when used in a topic modeling pipeline. Such viewpoint-infused embeddings have strong potential to support multiple semantic reasoning tasks including topic modeling, stance detection, and narrative detection.
Author summary
Information that spreads on social media can impact public opinion and real-world outcomes such as health behaviors. To understand these effects, researchers need tools to analyze large volumes of text and detect coherent narratives from disparate sets of posts. One prominent method for understanding such patterns is to represent posts in a mathematical embedding space, where posts with similar meanings appear close together. While existing methods succeed at grouping social media text by topic, they often fail to distinguish posts that have opposing viewpoints on the same topic.
In this paper, we develop an approach to improve embedding spaces for social media text data by using information from the social interactions of users on the social media platform. Because users who interact with each other are more likely to agree, this information helps distinguish conflicting views. Using a case study of social media discussion related to COVID-19 vaccination, we demonstrate that our novel approach leads to an improved ability to separate opposing viewpoints, providing a strong basis for narrative discovery within large social media datasets.
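The general idea of using social proximity as a weak label for contrastive learning can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the encoder, the loss, or how pairs are built. The sketch assumes that two tweets form a positive pair when their authors are linked by a retweet edge, and it swaps in a standard InfoNCE-style loss. All function names, user IDs, and toy embeddings are hypothetical.

```python
import math
from itertools import combinations


def positive_pairs(tweet_authors, retweet_edges):
    """Index pairs of tweets whose authors share a retweet edge (assumed weak label)."""
    linked = {frozenset(edge) for edge in retweet_edges}
    return [
        (i, j)
        for i, j in combinations(range(len(tweet_authors)), 2)
        if frozenset((tweet_authors[i], tweet_authors[j])) in linked
    ]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def info_nce(embeddings, pairs, temperature=0.1):
    """Mean InfoNCE-style loss: each anchor should be closer to its
    positive than to every other tweet in the batch."""
    losses = []
    for anchor, positive in pairs:
        logits = {
            k: cosine(embeddings[anchor], embeddings[k]) / temperature
            for k in range(len(embeddings))
            if k != anchor
        }
        log_denominator = math.log(sum(math.exp(v) for v in logits.values()))
        losses.append(log_denominator - logits[positive])
    return sum(losses) / len(losses)


# Hypothetical toy data: four tweets, one per user, two retweet edges.
tweet_authors = ["u1", "u2", "u3", "u4"]
retweet_edges = [("u1", "u2"), ("u3", "u4")]
pairs = positive_pairs(tweet_authors, retweet_edges)

# Embedding space where socially linked tweets sit close together ...
separated = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# ... versus one where they are placed apart.
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]

loss_separated = info_nce(separated, pairs)
loss_mixed = info_nce(mixed, pairs)
```

On this toy data the loss is lower for the embedding space that places socially linked tweets together, which is the signal a fine-tuning step of this kind would follow. A real pipeline would compute the embeddings with a trainable text encoder and minimize the loss by gradient descent.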
Suggested Citation
Kin Wai Ng & Nathan Wendt & Jasmine Eshun & Emily Saldanha, 2026.
"Weakly supervised contrastive representation learning to encode narrative viewpoint of COVID-19 tweets,"
PLOS Complex Systems, Public Library of Science, vol. 3(2), pages 1-20, February.
Handle: RePEc:plo:pcsy00:0000089
DOI: 10.1371/journal.pcsy.0000089