
Transfer learning for hate speech detection in social media

Authors
  • Lanqin Yuan

    (University of Technology Sydney)

  • Tianyu Wang

    (The Australian National University)

  • Gabriela Ferraro

    (The Australian National University)

  • Hanna Suominen

    (The Australian National University; University of Turku (UTU))

  • Marian-Andrei Rizoiu

    (University of Technology Sydney; The Australian National University)

Abstract

Today, the internet is an integral part of our daily lives, enabling people to be more connected than ever before. However, this greater connectivity and access to information increase exposure to harmful content, such as cyber-bullying and cyber-hatred. Models based on machine learning and natural language processing offer a way to make online platforms safer by identifying hate speech in web text autonomously. The main difficulty is annotating a sufficiently large number of examples to train these models. This paper uses a transfer learning technique to leverage two independent datasets jointly and build a single representation of hate speech. We build an interpretable two-dimensional visualization tool of the constructed hate speech representation, dubbed the Map of Hate, in which multiple datasets can be projected and comparatively analyzed. The hateful content is annotated differently across the two datasets (racist and sexist in one dataset, hateful and offensive in another). Nevertheless, the common representation successfully projects the harmless class of both datasets into the same space and can be used to uncover labeling errors (false positives). We also show that the joint representation boosts prediction performance when only a limited amount of supervision is available. These methods and insights hold the potential to make social media safer and to reduce the need to expose human moderators and annotators to distressing online messaging.

Suggested Citation

  • Lanqin Yuan & Tianyu Wang & Gabriela Ferraro & Hanna Suominen & Marian-Andrei Rizoiu, 2023. "Transfer learning for hate speech detection in social media," Journal of Computational Social Science, Springer, vol. 6(2), pages 1081-1101, October.
  • Handle: RePEc:spr:jcsosc:v:6:y:2023:i:2:d:10.1007_s42001-023-00224-9
    DOI: 10.1007/s42001-023-00224-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s42001-023-00224-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.
