
Ensuring survey research data integrity in the era of internet bots

Author

Listed:
  • Marybec Griffin

    (Rutgers University)

  • Richard J. Martino

    (Rutgers University)

  • Caleb LoSchiavo

    (Rutgers University)

  • Camilla Comer-Carruthers

    (Rutgers University)

  • Kristen D. Krause

    (Rutgers University)

  • Christopher B. Stults

    (Rutgers University
    City University of New York)

  • Perry N. Halkitis

    (Rutgers University)

Abstract

We used an internet-based survey platform to conduct a cross-sectional survey regarding the impact of COVID-19 on the LGBTQ+ population in the United States. While this method of data collection was quick and inexpensive, the data collected required extensive cleaning due to the infiltration of bots. Based on this experience, we provide recommendations for ensuring data integrity. Recruitment conducted between May 7 and 8, 2020 resulted in an initial sample of 1251 responses. The Qualtrics survey was disseminated via social media and professional association listservs. After noticing data discrepancies, research staff developed a rigorous data cleaning protocol. A second wave of recruitment was conducted on June 11–12, 2020 using the original recruitment methods. The five-step data cleaning protocol led to the removal of 773 (61.8%) surveys from the initial dataset, resulting in a sample of 478 participants in the first wave of data collection. The protocol led to the removal of 46 (31.9%) surveys from the second two-day wave of data collection, resulting in a sample of 98 participants in the second wave. After verifying that the two-day pilot process was effective at screening for bots, the survey was reopened for a third wave of data collection, which yielded a total of 709 responses; of these, 514 (72.5%) were identified as valid participants and 194 (27.4%) were removed as possible bots. The final analytic sample consists of 1090 participants. Although a useful and efficient research tool, especially among hard-to-reach populations, internet-based research is vulnerable to bots and mischievous responders, despite survey platforms’ built-in protections. Beyond the depletion of research funds, bot infiltration threatens data integrity and may disproportionately harm research with marginalized populations.
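The wave-by-wave figures reported above can be re-derived with a few lines of arithmetic. The sketch below is only a consistency check on the numbers in the abstract, not part of the authors' protocol; the names `Wave`, `WAVES`, `pct`, and `summarize` are hypothetical, and the wave 2 total of 144 is inferred as 98 + 46 because the abstract does not state it. Note that the wave 3 counts (514 + 194) sum to 708 while 709 responses are reported; the published percentages match a denominator of 709.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wave:
    label: str
    total: int    # responses received
    valid: int    # retained after data cleaning
    removed: int  # removed as possible bots


# Counts taken from the abstract; the wave 2 total is an inference (98 + 46).
WAVES = [
    Wave("wave 1", total=1251, valid=478, removed=773),
    Wave("wave 2", total=144, valid=98, removed=46),
    Wave("wave 3", total=709, valid=514, removed=194),
]


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


def summarize(waves):
    """Per-wave removal and retention rates, plus any responses not accounted for."""
    return [
        {
            "label": w.label,
            "removed_pct": pct(w.removed, w.total),
            "valid_pct": pct(w.valid, w.total),
            "unaccounted": w.total - w.valid - w.removed,
        }
        for w in waves
    ]


# Final analytic sample: valid participants summed across the three waves.
final_sample = sum(w.valid for w in WAVES)

if __name__ == "__main__":
    for row in summarize(WAVES):
        print(row)
    print("final analytic sample:", final_sample)
```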
Based on our experience, we recommend the use of strategies such as qualitative questions, duplicate demographic questions, and incentive raffles to reduce the likelihood of mischievous respondents. These protections can be undertaken to ensure data integrity and facilitate research on vulnerable populations.
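One way the duplicate demographic questions recommended above might be used during cleaning is to flag responses whose repeated answers disagree. The following is a minimal sketch under stated assumptions: it is not the authors' five-step protocol (which the abstract does not detail), and the field names (`age_start`, `age_end`, `zip_start`, `zip_end`), the `id` key, and the function `flag_inconsistent` are all hypothetical.

```python
def normalize(value) -> str:
    """Compare answers loosely: ignore case and surrounding whitespace."""
    return str(value).strip().lower()


def flag_inconsistent(responses, pairs):
    """Return ids of responses where any repeated demographic answer disagrees.

    `responses` is a list of dicts (one per survey response) and `pairs` is a
    list of (first_field, repeated_field) tuples. Field names are assumptions.
    """
    flagged = []
    for response in responses:
        for first, repeated in pairs:
            a = normalize(response.get(first, ""))
            b = normalize(response.get(repeated, ""))
            if a != b:
                flagged.append(response["id"])
                break  # one mismatch is enough to flag the response
    return flagged


# Hypothetical example data: the same questions asked early and late in the survey.
PAIRS = [("age_start", "age_end"), ("zip_start", "zip_end")]
SAMPLE = [
    {"id": "r1", "age_start": "29", "age_end": "29", "zip_start": "07102", "zip_end": "07102"},
    {"id": "r2", "age_start": "29", "age_end": "41", "zip_start": "10001", "zip_end": "10001"},
    {"id": "r3", "age_start": " 35", "age_end": "35", "zip_start": "08901", "zip_end": "08901"},
]

if __name__ == "__main__":
    print(flag_inconsistent(SAMPLE, PAIRS))
```

Flagged responses would still warrant manual review, since a human respondent can also mistype an answer.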

Suggested Citation

  • Marybec Griffin & Richard J. Martino & Caleb LoSchiavo & Camilla Comer-Carruthers & Kristen D. Krause & Christopher B. Stults & Perry N. Halkitis, 2022. "Ensuring survey research data integrity in the era of internet bots," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(4), pages 2841-2852, August.
  • Handle: RePEc:spr:qualqt:v:56:y:2022:i:4:d:10.1007_s11135-021-01252-1
    DOI: 10.1007/s11135-021-01252-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11135-021-01252-1
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Martine Selm & Nicholas Jankowski, 2006. "Conducting Online Surveys," Quality & Quantity: International Journal of Methodology, Springer, vol. 40(3), pages 435-456, June.
    2. Jeffrey M. Perkel, 2020. "Mischief-making bots attacked my scientific survey," Nature, Nature, vol. 579(7799), pages 461-461, March.

