
Uncovering digital trace data biases: tracking undercoverage in web tracking data

Author

Listed:
  • Bosch, Oriol J.

    (The London School of Economics and Political Science)

  • Sturgis, Patrick
  • Kuha, Jouni
  • Revilla, Melanie

Abstract

In the digital age, understanding people’s online behaviours is vital. Digital trace data has emerged as a popular alternative to surveys and is often hailed as the gold standard. This study critically assesses the use of web tracking data to study online media exposure. Specifically, we focus on a critical error source in this type of data, tracking undercoverage: researchers’ failure to capture data from all the devices and browsers that individuals use to go online. Using data from Spain, Portugal, and Italy, we explore undercoverage in commercial online panels and simulate biases in online media exposure estimates. The paper shows that tracking undercoverage is highly prevalent when using commercial panels, with more than 70% of participants affected. In addition, the primary determinant of undercoverage is the type and number of devices used for internet access, rather than individual characteristics and attitudes. Through a simulation study, the paper also demonstrates that web tracking estimates, both univariate and multivariate, are often substantially biased due to tracking undercoverage. This represents the first empirical evidence demonstrating that web tracking data is, effectively, biased. Methodologically, the paper showcases how survey questions can be used as auxiliary information to identify and simulate web tracking errors.

Suggested Citation

  • Bosch, Oriol J. & Sturgis, Patrick & Kuha, Jouni & Revilla, Melanie, 2023. "Uncovering digital trace data biases: tracking undercoverage in web tracking data," SocArXiv t2dbj, Center for Open Science.
  • Handle: RePEc:osf:socarx:t2dbj
    DOI: 10.31219/osf.io/t2dbj

    Download full text from publisher

    File URL: https://osf.io/download/6521304251280101f22a4412/
    Download Restriction: no


    References listed on IDEAS

    1. Albert Padró-Solanet & Joan Balcells, 2022. "Media Diet and Polarisation: Evidence from Spain," South European Society and Politics, Taylor & Francis Journals, vol. 27(1), pages 75-95, January.
    2. Andrew M. Guess & Brendan Nyhan & Jason Reifler, 2020. "Exposure to untrustworthy websites in the 2016 US election," Nature Human Behaviour, Nature, vol. 4(5), pages 472-480, May.
    3. Ridhi Kashyap & Masoomali Fatehkia & Reham Al Tamime & Ingmar Weber, 2020. "Monitoring global digital gender inequality using the online populations of Facebook and Google," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 43(27), pages 779-816.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sergei Guriev & Elias Papaioannou, 2022. "The Political Economy of Populism," Journal of Economic Literature, American Economic Association, vol. 60(3), pages 753-832, September.
    2. Nyabuti Damaris Kemunto & Prof. Hezron Mogambi & Dr. Anita Kiamba, 2023. "Foreign Policy Disinformation: Fueling Polarization and Deterioration of the Public Sphere in Kenya," International Journal of Research and Innovation in Social Science, International Journal of Research and Innovation in Social Science (IJRISS), vol. 7(8), pages 425-442, August.
    3. Philipp Lorenz-Spreen & Stephan Lewandowsky & Cass R. Sunstein & Ralph Hertwig, 2020. "How behavioural sciences can promote truth, autonomy and democratic discourse online," Nature Human Behaviour, Nature, vol. 4(11), pages 1102-1109, November.
    4. André Grow & Daniela Perrotta & Emanuele Del Fava & Jorge Cimentada & Francesco Rampazzo & B. Sofia Gil-Clavel & Emilio Zagheni & René D. Flores & Ilana Ventura & Ingmar G. Weber, 2021. "How reliable is Facebook’s advertising data for use in social science research? Insights from a cross-national online survey," MPIDR Working Papers WP-2021-006, Max Planck Institute for Demographic Research, Rostock, Germany.
    5. Saumya Bhadani & Shun Yamaya & Alessandro Flammini & Filippo Menczer & Giovanni Luca Ciampaglia & Brendan Nyhan, 2022. "Political audience diversity and news reliability in algorithmic ranking," Nature Human Behaviour, Nature, vol. 6(4), pages 495-505, April.
    6. Gregory Eady & Tom Paskhalis & Jan Zilinsky & Richard Bonneau & Jonathan Nagler & Joshua A. Tucker, 2023. "Exposure to the Russian Internet Research Agency foreign influence campaign on Twitter in the 2016 US election and its relationship to attitudes and voting behavior," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    7. Daniel Muise & Nilam Ram & Thomas Robinson & Byron Reeves, 2023. "Identification, Impacts, and Opportunities of Three Common Measurement Considerations when using Digital Trace Data," Papers 2310.00197, arXiv.org.
    8. repec:cup:judgdm:v:16:y:2021:i:2:p:484-504 is not listed on IDEAS
    9. Robert M. Ross & David G. Rand & Gordon Pennycook, 2021. "Beyond “fake news”: Analytic thinking and the detection of false and hyperpartisan news headlines," Judgment and Decision Making, Society for Judgment and Decision Making, vol. 16(2), pages 484-504, March.
    10. Alexander J. Stewart & Antonio A. Arechar & David G. Rand & Joshua B. Plotkin, 2021. "The Game Theory of Fake News," Papers 2108.13687, arXiv.org, revised Sep 2023.
    11. Meng Zhen Larsen & Michael R. Haupt & Tiana McMann & Raphael E. Cuomo & Tim K. Mackey, 2023. "The Influence of News Consumption Habits and Dispositional Traits on Trust in Medical Scientists," IJERPH, MDPI, vol. 20(10), pages 1-13, May.
    12. João Pedro Baptista & Anabela Gradim, 2020. "Understanding Fake News Consumption: A Review," Social Sciences, MDPI, vol. 9(10), pages 1-22, October.
    13. Mohsen Mosleh & David G. Rand, 2022. "Measuring exposure to misinformation from political elites on Twitter," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    14. Brian Hughes & Kesa White & Jennifer West & Meili Criezis & Cindy Zhou & Sarah Bartholomew, 2021. "Cultural Variance in Reception and Interpretation of Social Media COVID-19 Disinformation in French-Speaking Regions," IJERPH, MDPI, vol. 18(23), pages 1-28, November.
    15. Jana Lasser & Segun T. Aroyehun & Fabio Carrella & Almog Simchon & David Garcia & Stephan Lewandowsky, 2023. "From alternative conceptions of honesty to alternative facts in communications by US politicians," Nature Human Behaviour, Nature, vol. 7(12), pages 2140-2151, December.
    16. André Grow & Daniela Perrotta & Emanuele Del Fava & Jorge Cimentada & Francesco Rampazzo & Sofia Gil‐Clavel & Emilio Zagheni & René D. Flores & Ilana Ventura & Ingmar Weber, 2022. "Is Facebook's advertising data accurate enough for use in social science research? Insights from a cross‐national online survey," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(S2), pages 343-363, December.
    17. Fabio Padovano & Pauline Mille, 2023. "Education, fake news and the Political Budget Cycle," Economics Working Paper from Condorcet Center for political Economy at CREM-CNRS 2023-01-ccr, Condorcet Center for political Economy.
    18. Alfredo Guzmán Rincón & Sandra Barragán Moreno & Belén Rodríguez-Canovas & Ruby Lorena Carrillo Barbosa & David Ricardo Africano Franco, 2023. "Social networks, disinformation and diplomacy: a dynamic model for a current problem," Palgrave Communications, Palgrave Macmillan, vol. 10(1), pages 1-14, December.
    19. Ryan C. Moore & Ross Dahlke & Jeffrey T. Hancock, 2023. "Exposure to untrustworthy websites in the 2020 US election," Nature Human Behaviour, Nature, vol. 7(7), pages 1096-1105, July.
    20. Gordon Pennycook & David G. Rand, 2022. "Accuracy prompts are a replicable and generalizable approach for reducing the spread of misinformation," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    21. Christopher Adamo & Jeffrey Carpenter, 2023. "Sentiment and the belief in fake news during the 2020 presidential primaries," Oxford Open Economics, Oxford University Press, vol. 2, pages 512-547.

