Authors
- Habibi, Mahyar
(Department of Economics, Bocconi University)
- Hovy, Dirk
(Department of Computing Sciences, Bocconi University)
- Schwarz, Carlo
(Department of Economics, Bocconi University)
Abstract
There is an ongoing debate about how to moderate toxic speech on social media and the impact of content moderation on online discourse. This paper proposes and validates a methodology for measuring the content-moderation-induced distortions in online discourse using text embeddings from computational linguistics. Applying the method to a representative sample of 5 million US political Tweets, we find that removing toxic Tweets significantly alters the semantic composition of content. The magnitudes of the distortions are comparable to removing 4 out of 67 topics from the online discourse at random. This finding is consistent across different embedding models, toxicity metrics, and samples. Importantly, we demonstrate that these effects are not solely driven by toxic language but by the removal of topics often expressed in toxic form. We propose an alternative approach to content moderation that uses generative Large Language Models to rephrase toxic Tweets, preserving their salvageable content rather than removing them entirely. We show that this rephrasing strategy reduces toxicity while mitigating distortions in online content.
Suggested Citation
Habibi, Mahyar & Hovy, Dirk & Schwarz, Carlo, 2026. "The Content Moderator’s Dilemma: Removal of Toxic Content and Distortions to Online Discourse," CAGE Online Working Paper Series 793, Competitive Advantage in the Global Economy (CAGE).
Handle:
RePEc:cge:wacage:793