IDEAS home Printed from https://ideas.repec.org/a/cup/apsrev/v110y2016i02p278-295_00.html

Crowd-sourced Text Analysis: Reproducible and Agile Production of Political Data

Author

Listed:
  • BENOIT, KENNETH
  • CONWAY, DREW
  • LAUDERDALE, BENJAMIN E.
  • LAVER, MICHAEL
  • MIKHAYLOV, SLAVA

Abstract

Empirical social science often relies on data that are not observed in the field, but are transformed into quantitative variables by expert researchers who analyze and interpret qualitative raw sources. While generally considered the most valid way to produce data, this expert-driven process is inherently difficult to replicate or to assess on grounds of reliability. Using crowd-sourcing to distribute text for reading and interpretation by massive numbers of nonexperts, we generate results comparable to those using experts to read and interpret the same texts, but do so far more quickly and flexibly. Crucially, the data we collect can be reproduced and extended transparently, making crowd-sourced datasets intrinsically reproducible. This focuses researchers’ attention on the fundamental scientific objective of specifying reliable and replicable methods for collecting the data needed, rather than on the content of any particular dataset. We also show that our approach works straightforwardly with different types of political text, written in different languages. While findings reported here concern text analysis, they have far-reaching implications for expert-generated data in the social sciences.

Suggested Citation

  • Benoit, Kenneth & Conway, Drew & Lauderdale, Benjamin E. & Laver, Michael & Mikhaylov, Slava, 2016. "Crowd-sourced Text Analysis: Reproducible and Agile Production of Political Data," American Political Science Review, Cambridge University Press, vol. 110(2), pages 278-295, May.
  • Handle: RePEc:cup:apsrev:v:110:y:2016:i:02:p:278-295_00

    Download full text from publisher

    File URL: https://www.cambridge.org/core/product/identifier/S0003055416000058/type/journal_article
    File Function: link to article abstract page
    Download Restriction: no

    Citations



    Cited by:

    1. Cindy Cheng & Joan Barcelo & Allison Spencer Hartnett & Robert Kubinec & Luca Messerschmidt, 2020. "CoronaNet: A Dyadic Dataset of Government Responses to the COVID-19 Pandemic," Working Papers 20200042, New York University Abu Dhabi, Department of Social Science, revised Apr 2020.
    2. Ginevra Floridi & Benjamin E. Lauderdale, 2022. "Pairwise comparisons as a scale development tool for composite measures," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(2), pages 519-542, April.
    3. Yannis Theocharis & Pablo Barberá & Zoltán Fazekas & Sebastian Adrian Popa, 2020. "The Dynamics of Political Incivility on Twitter," SAGE Open, vol. 10(2), pages 21582440209, May.
    4. Keren Weinshall & Lee Epstein, 2020. "Developing High‐Quality Data Infrastructure for Legal Analytics: Introducing the Israeli Supreme Court Database," Journal of Empirical Legal Studies, John Wiley & Sons, vol. 17(2), pages 416-434, June.
    5. Elio Amicarelli & Jessica Di Salvatore, 2021. "Introducing the PeaceKeeping Operations Corpus (PKOC)," Journal of Peace Research, Peace Research Institute Oslo, vol. 58(5), pages 1137-1148, September.
    6. Miriam Sorace, 2018. "The European Union democratic deficit: Substantive representation in the European Parliament at the input stage," European Union Politics, vol. 19(1), pages 3-24, March.
    7. Stephen A Meserve & Sivagaminathan Palani & Daniel Pemstein, 2018. "Measuring candidate selection mechanisms in European elections: Comparing formal party rules to candidate survey responses," European Union Politics, vol. 19(1), pages 185-202, March.
    8. Wessel Wijtvliet & Arthur Dyevre, 2021. "Judicial ideology in economic cases: Evidence from the General Court of the European Union," European Union Politics, vol. 22(1), pages 25-45, March.
    9. Christopher J Fariss & James Lo, 2020. "Innovations in concepts and measurement for the study of peace and conflict," Journal of Peace Research, Peace Research Institute Oslo, vol. 57(6), pages 669-678, November.
    10. Joshua Robison & Randy T. Stevenson & James N. Druckman & Simon Jackman & Jonathan N. Katz & Lynn Vavreck, 2018. "An Audit of Political Behavior Research," SAGE Open, vol. 8(3), pages 21582440187, August.
    11. Zobel, Malisa & Lehmann, Pola, 2018. "Positions and saliency of immigration in party manifestos: A novel dataset using crowd coding," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 57(4), pages 1056-1083.
    12. Sorace, Miriam, 2018. "The European Union democratic deficit: substantive representation in the European Parliament at the input stage," LSE Research Online Documents on Economics 87625, London School of Economics and Political Science, LSE Library.
    13. Martin Haselmayer & Marcelo Jenny, 2017. "Sentiment analysis of political communication: combining a dictionary approach with crowdcoding," Quality & Quantity: International Journal of Methodology, Springer, vol. 51(6), pages 2623-2646, November.
    14. Anton Oleinik, 2024. "A Bayesian index of association: comparison with other measures and performance," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(1), pages 277-305, February.
    15. Cindy Cheng & Joan Barceló & Allison Spencer Hartnett & Robert Kubinec & Luca Messerschmidt, 2020. "COVID-19 Government Response Event Dataset (CoronaNet v.1.0)," Nature Human Behaviour, Nature, vol. 4(7), pages 756-768, July.
    16. Mubashir Qasim, 2019. "Sustainability and Wellbeing: A Text Analysis of New Zealand Parliamentary Debates, Official Yearbooks and Ministerial Documents," Working Papers in Economics 19/01, University of Waikato.
    17. Denisa Kostovicova & Rachel Kerr & Ivor Sokolić & Tiffany Fairey & Henry Redwood & Jelena Subotić, 2022. "The “Digital Turn” in Transitional Justice Research: Evaluating Image and Text as Data in the Western Balkans," Comparative Southeast European Studies, De Gruyter, vol. 70(1), pages 24-46, March.

