
Semi-supervised self-training for COVID-19 misinformation detection: analyzing Twitter data and alternative news media on Norwegian Twitter

Author

  • Siri Frisli (Oslo Metropolitan University)

Abstract

This paper investigates the dissemination of COVID-19 misinformation on Twitter within the context of the Norwegian media landscape, characterized by high levels of trust in the media, yet experiencing an increasing influence of alternative news sources. Using a semi-supervised self-training approach for text classification, a dataset of 426,262 tweets is analyzed, identifying approximately 5.11% as misinformation. The study reveals that misinformation tweets receive heightened engagement, particularly in retweets, and originate predominantly from a small group of users. Furthermore, while misinformation tweets are more likely to link to alternative news media sites, these sites represent only a minor fraction of the overall links shared. The analysis highlights distinct temporal patterns, with misinformation activity spiking during significant events such as the arrival of COVID-19 vaccines in Norway and the emergence of the Omicron variant. This research underscores the complexity of misinformation dynamics in a high-trust media environment and emphasizes the need for effective strategies to combat misinformation, particularly from alternative news media that challenge conventional narratives while often propagating falsehoods. Overall, the findings contribute valuable insights into the interplay between social media, alternative news sources, and misinformation dissemination during a global pandemic.
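The self-training idea named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the abstract does not state which classifier, features, confidence threshold, or stopping rule were used, so the tiny Naive Bayes model, the `threshold` and `max_rounds` parameters, the label names, and the English toy tweets below are all assumptions made for the example. The loop trains on the labeled tweets, pseudo-labels the unlabeled tweets it is confident about, adds them to the training set, and repeats until no new tweets clear the threshold.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict_proba(self, text):
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for c in self.classes:
            n_words = sum(self.word_counts[c].values())
            score = math.log(self.class_counts[c] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log(
                        (self.word_counts[c][word] + 1) / (n_words + len(self.vocab))
                    )
            log_scores[c] = score
        # Normalize log scores into probabilities.
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Grow the training set with confident pseudo-labels until none are added."""
    texts = [t for t, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    model = NaiveBayes().fit(texts, labels)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for text in pool:
            probs = model.predict_proba(text)
            label, confidence = max(probs.items(), key=lambda kv: kv[1])
            if confidence >= threshold:
                texts.append(text)
                labels.append(label)
                added += 1
            else:
                remaining.append(text)
        pool = remaining
        if added == 0:
            break
        model = NaiveBayes().fit(texts, labels)  # retrain on labeled + pseudo-labeled
    return model, pool


# Invented toy data, not drawn from the study's 426,262 tweets.
labeled = [
    ("vaccine contains microchip tracking", "misinfo"),
    ("covid is a hoax invented by elites", "misinfo"),
    ("health authority reports new vaccine doses arrived", "other"),
    ("municipality opens new testing station today", "other"),
]
unlabeled = [
    "the vaccine microchip hoax is real",
    "new testing station opens today",
    "weather is nice",
]

model, still_unlabeled = self_train(labeled, unlabeled)
print(model.predict_proba("microchip hoax"))
print(still_unlabeled)
```

A known weakness of this scheme is that early mistakes are reinforced: a wrong pseudo-label becomes training data for every later round. A higher threshold trades coverage for fewer such errors, which is why low-confidence tweets are left unlabeled rather than forced into a class.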

Suggested Citation

  • Siri Frisli, 2025. "Semi-supervised self-training for COVID-19 misinformation detection: analyzing Twitter data and alternative news media on Norwegian Twitter," Journal of Computational Social Science, Springer, vol. 8(2), pages 1-34, May.
  • Handle: RePEc:spr:jcsosc:v:8:y:2025:i:2:d:10.1007_s42001-025-00367-x
    DOI: 10.1007/s42001-025-00367-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s42001-025-00367-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



