
Transfer learning for hate speech detection in social media

Author

Listed:
  • Lanqin Yuan

    (University of Technology Sydney)

  • Tianyu Wang

    (The Australian National University)

  • Gabriela Ferraro

    (The Australian National University)

  • Hanna Suominen

    (The Australian National University; University of Turku (UTU))

  • Marian-Andrei Rizoiu

    (University of Technology Sydney; The Australian National University)

Abstract

Today, the internet is an integral part of our daily lives, enabling people to be more connected than ever before. However, this greater connectivity and access to information increase exposure to harmful content, such as cyber-bullying and cyber-hatred. Models based on machine learning and natural language processing offer a way to make online platforms safer by identifying hate speech in web text automatically. The main difficulty, however, is annotating a sufficiently large number of examples to train these models. This paper uses a transfer learning technique to leverage two independent datasets jointly and build a single representation of hate speech. We build an interpretable two-dimensional visualization tool of the constructed hate speech representation, dubbed the Map of Hate, in which multiple datasets can be projected and comparatively analyzed. The hateful content is annotated differently across the two datasets (racist and sexist in one dataset, hateful and offensive in the other). Nevertheless, the common representation successfully projects the harmless class of both datasets into the same space and can be used to uncover labeling errors (false positives). We also show that the joint representation boosts prediction performance when only a limited amount of supervision is available. These methods and insights hold the potential for safer social media and could reduce the need to expose human moderators and annotators to distressing online messaging.
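
The idea of one representation shared by two differently labeled datasets can be illustrated with a minimal sketch. The abstract does not describe the paper's architecture, so everything below is an assumption made for illustration: the toy sentences with placeholder tokens, the two-dimensional linear encoder W, the per-dataset softmax heads, and all function names are hypothetical and are not the authors' method. The sketch only shows the general pattern of training a shared encoder jointly on both datasets while each dataset keeps its own label set.

```python
import math
import random

# Toy stand-ins for two independently annotated corpora (hypothetical data;
# upper-case tokens are placeholders, not real vocabulary).
DATA = {
    "A": [("have a nice day friends", "harmless"), ("lovely weather today", "harmless"),
          ("RACIST_TERM go away", "racist"), ("those people RACIST_TERM", "racist"),
          ("SEXIST_TERM stay home", "sexist"), ("typical SEXIST_TERM", "sexist")],
    "B": [("nice weather friends", "harmless"), ("have a lovely day", "harmless"),
          ("HATE_TERM those people", "hateful"), ("go away HATE_TERM", "hateful"),
          ("RUDE_TERM you", "offensive"), ("typical RUDE_TERM", "offensive")],
}
LABELS = {"A": ["harmless", "racist", "sexist"],
          "B": ["harmless", "hateful", "offensive"]}
VOCAB = sorted({w for rows in DATA.values() for text, _ in rows for w in text.split()})
DIM = 2  # a 2-D shared space, so every text maps to a point that can be plotted


def bag_of_words(text):
    """Count vector over the joint vocabulary; unknown words are ignored."""
    words = text.split()
    return [float(words.count(w)) for w in VOCAB]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def init_params(seed=0):
    """Shared encoder W (DIM x |VOCAB|) plus one linear head per dataset."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in VOCAB] for _ in range(DIM)]
    heads = {name: {"U": [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in labels],
                    "b": [0.0] * len(labels)}
             for name, labels in LABELS.items()}
    return W, heads


def encode(W, text):
    """Project a text into the shared 2-D representation."""
    x = bag_of_words(text)
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def predict_proba(W, head, text):
    h = encode(W, text)
    logits = [sum(u * v for u, v in zip(row, h)) + b
              for row, b in zip(head["U"], head["b"])]
    return softmax(logits)


def total_loss(W, heads):
    """Summed cross-entropy over both datasets, each scored by its own head."""
    loss = 0.0
    for name, rows in DATA.items():
        for text, label in rows:
            p = predict_proba(W, heads[name], text)
            loss -= math.log(p[LABELS[name].index(label)])
    return loss


def train(W, heads, epochs=200, lr=0.1):
    """Per-example SGD: every example updates its own head and the shared W."""
    for _ in range(epochs):
        for name, rows in DATA.items():
            head = heads[name]
            for text, label in rows:
                x, h = bag_of_words(text), encode(W, text)
                p = predict_proba(W, head, text)
                y = LABELS[name].index(label)
                d = [pi - (1.0 if i == y else 0.0) for i, pi in enumerate(p)]
                # Gradient w.r.t. the shared representation, taken before the head update.
                dh = [sum(d[c] * head["U"][c][j] for c in range(len(d))) for j in range(DIM)]
                for c in range(len(d)):
                    for j in range(DIM):
                        head["U"][c][j] -= lr * d[c] * h[j]
                    head["b"][c] -= lr * d[c]
                for j in range(DIM):
                    for k, xk in enumerate(x):
                        if xk:
                            W[j][k] -= lr * dh[j] * xk
    return W, heads


if __name__ == "__main__":
    W, heads = init_params()
    print("loss before:", total_loss(W, heads))
    W, heads = train(W, heads)
    print("loss after: ", total_loss(W, heads))
    print("2-D point:  ", encode(W, "have a nice day"))
```

After training, encode(W, text) returns a two-dimensional point for a text from either dataset, which is the kind of coordinate a map-style visualization could plot. Whether the harmless examples of both datasets end up close together in this toy setting depends on the data and the initialization; the sketch makes no claim about reproducing the paper's results.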

Suggested Citation

  • Lanqin Yuan & Tianyu Wang & Gabriela Ferraro & Hanna Suominen & Marian-Andrei Rizoiu, 2023. "Transfer learning for hate speech detection in social media," Journal of Computational Social Science, Springer, vol. 6(2), pages 1081-1101, October.
  • Handle: RePEc:spr:jcsosc:v:6:y:2023:i:2:d:10.1007_s42001-023-00224-9
    DOI: 10.1007/s42001-023-00224-9

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s42001-023-00224-9
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.




