
Detecting science-based health disinformation: a stylometric machine learning approach

Authors

  • Jason A. Williams (Augusta University)
  • Ahmed Aleroud (Augusta University)
  • Danielle Zimmerman (Augusta University)

Abstract

The COVID-19 pandemic showed that misleading scientific health information has become widespread and is difficult to counteract. Some of this disinformation stems from the modification of medical research results. This paper investigates how humans create health disinformation through controlled changes to text from abstracts of peer-reviewed COVID-19 research papers. We also develop a machine learning model that uses statement embeddings, readability, and text quality features to create datasets containing falsified scientific statements. We then build machine learning classification models to identify statements containing disinformation. Our results reveal the importance of readability metrics and information quality features in identifying which statements were falsified. We show that text embeddings and semantic similarity yield a lower detection rate for true/falsified statements than information quality and readability features.
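
To make the notion of a readability feature concrete, the following is a minimal sketch of how one such metric could be computed for a single statement. It is an illustration only, not the authors' pipeline: the abstract does not say which readability metrics were used, so the choice of the Flesch Reading Ease formula, the vowel-group syllable heuristic, and the function names `count_syllables` and `readability_features` are all assumptions.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (assumed heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability_features(text: str) -> dict:
    """Return simple readability features for one statement.

    Sentence splitting on ., ! and ? is naive: abbreviations and decimal
    numbers will be treated as sentence boundaries.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are alphabetic runs, optionally joined by hyphens or apostrophes.
    words = re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard Flesch Reading Ease formula (higher means easier to read).
    flesch = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    return {
        "words_per_sentence": words_per_sentence,
        "syllables_per_word": syllables_per_word,
        "flesch_reading_ease": flesch,
    }


if __name__ == "__main__":
    # Hypothetical statement, used only to exercise the function.
    statement = "Vaccination reduced the rate of severe illness in the cohort."
    for name, value in readability_features(statement).items():
        print(f"{name}: {value:.2f}")
```

Feature dictionaries of this kind, computed for each true and falsified statement, could then be passed to any standard classifier; which classifier and which feature set perform best is a question the paper itself addresses.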

Suggested Citation

  • Jason A. Williams & Ahmed Aleroud & Danielle Zimmerman, 2023. "Detecting science-based health disinformation: a stylometric machine learning approach," Journal of Computational Social Science, Springer, vol. 6(2), pages 817-843, October.
  • Handle: RePEc:spr:jcsosc:v:6:y:2023:i:2:d:10.1007_s42001-023-00213-y
    DOI: 10.1007/s42001-023-00213-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s42001-023-00213-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.




