Using Twitter data for demographic research

Authors

  • Dilek Yildiz (International Institute for Applied Systems Analysis (IIASA))
  • Jo Munson (University of Southampton)
  • Agnese Vitali (Università degli Studi di Trento)
  • Ramine Tinati (University of Southampton)
  • Jennifer A. Holland (Erasmus Universiteit Rotterdam)

Abstract

Background: Social media data is a promising source of social science data. However, deriving the demographic characteristics of users and dealing with the nonrandom, nonrepresentative populations from which they are drawn represent challenges for social scientists.

Objective: Given the growing use of social media data in social science research, this paper asks two questions: 1) To what extent are findings obtained with social media data generalizable to broader populations, and 2) what is the best practice for estimating demographic information from Twitter data?

Methods: Our analyses use information gathered from 979,992 geo-located Tweets sent by 22,356 unique users in South East England between 23 June and 4 July 2014. We estimate demographic characteristics of the Twitter users with the crowd-sourcing platform CrowdFlower and the image-recognition software Face++. To evaluate bias in the data, we run a series of log-linear models with offsets and calibrate the nonrepresentative sample of Twitter users with mid-year population estimates for South East England.

Results: CrowdFlower proves to be more accurate than Face++ for the measurement of age, whereas both tools are highly reliable for measuring the sex of Twitter users. The calibration exercise allows bias correction in the age-, sex-, and location-specific population counts obtained from the Twitter population by augmenting Twitter data with mid-year population estimates.

Contribution: The paper proposes best practices for estimating Twitter users’ basic demographic characteristics and a calibration method to address the selection bias in the Twitter population, allowing researchers to generalize findings based on Twitter to the general population.
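As a rough illustration of what a log-linear model with an offset can look like in this setting, the sketch below fits a main-effects model, log(mu_ij) = log(N_ij) + a_i + b_j, where N_ij is a population count used as the offset and mu_ij is the expected number of Twitter users in age group i and sex j. It uses iterative proportional fitting seeded with the offset table, which yields the Poisson maximum likelihood fit for this particular model. The function name, the two-way table layout, and all numbers are assumptions made for illustration; they are not taken from the paper, and the authors' actual model specifications may differ.

```python
from typing import List

Table = List[List[float]]


def fit_loglinear_with_offset(counts: Table, offset: Table,
                              max_iter: int = 200, tol: float = 1e-9) -> Table:
    """Fit log(mu_ij) = log(offset_ij) + a_i + b_j by iterative proportional fitting.

    Seeding with the offset table and rescaling rows and columns until the
    fitted margins match the observed margins gives the Poisson maximum
    likelihood fit of the main-effects model with an offset.
    """
    n_rows, n_cols = len(counts), len(counts[0])
    row_targets = [sum(row) for row in counts]
    col_targets = [sum(counts[i][j] for i in range(n_rows)) for j in range(n_cols)]
    mu = [row[:] for row in offset]  # start from the offset (population) table

    for _ in range(max_iter):
        # Rescale each row to reproduce the observed row totals.
        for i in range(n_rows):
            factor = row_targets[i] / sum(mu[i])
            mu[i] = [value * factor for value in mu[i]]
        # Rescale each column to reproduce the observed column totals.
        for j in range(n_cols):
            factor = col_targets[j] / sum(mu[i][j] for i in range(n_rows))
            for i in range(n_rows):
                mu[i][j] *= factor
        # Columns now match exactly; stop once the rows match as well.
        row_gap = max(abs(sum(mu[i]) - row_targets[i]) for i in range(n_rows))
        if row_gap < tol:
            break
    return mu


# Hypothetical illustration data (NOT from the paper):
# rows are age groups, columns are sexes.
twitter_users = [[900.0, 700.0], [500.0, 450.0], [60.0, 90.0]]
population = [[50000.0, 48000.0], [120000.0, 125000.0], [70000.0, 85000.0]]

fitted = fit_loglinear_with_offset(twitter_users, population)

# Modelled Twitter users per resident in each age-sex cell.
fitted_rates = [[m / n for m, n in zip(m_row, n_row)]
                for m_row, n_row in zip(fitted, population)]

# Inverse of the fitted rate: residents represented by one Twitter user.
calibration_weights = [[1.0 / rate for rate in row] for row in fitted_rates]
```

Under these assumptions, cells with a low fitted rate (groups underrepresented on Twitter relative to the population) receive larger calibration weights. A model with more terms, such as location or interaction effects, would need a different fitting routine.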

Suggested Citation

  • Dilek Yildiz & Jo Munson & Agnese Vitali & Ramine Tinati & Jennifer A. Holland, 2017. "Using Twitter data for demographic research," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 37(46), pages 1477-1514.
  • Handle: RePEc:dem:demres:v:37:y:2017:i:46
    DOI: 10.4054/DemRes.2017.37.46

    Download full text from publisher

    File URL: https://www.demographic-research.org/volumes/vol37/46/37-46.pdf
    Download Restriction: no


    References listed on IDEAS

    1. Peter W. F. Smith & James Raymer & Corrado Giulietti, 2010. "Combining available migration data in England to study economic activity flows over time," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 173(4), pages 733-753, October.
    2. Frans Willekens, 1999. "Modeling approaches to the indirect estimation of migration flows: From entropy to EM," Mathematical Population Studies, Taylor & Francis Journals, vol. 7(3), pages 239-278.
    3. James Raymer & Guy Abel & Peter W. F. Smith, 2007. "Combining census and registration data to estimate detailed elderly migration flows in England and Wales," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(4), pages 891-908, October.
    4. Emilio Zagheni & Ingmar Weber, 2015. "Demographic research with non-representative internet data," International Journal of Manpower, Emerald Group Publishing Limited, vol. 36(1), pages 13-25, April.
    5. Luke Sloan & Jeffrey Morgan & William Housley & Matthew Williams & Adam Edwards & Pete Burnap & Omer Rana, 2013. "Knowing the Tweeters: Deriving Sociologically Relevant Demographics from Twitter," Sociological Research Online, vol. 18(3), pages 74-84, August.

    Citations


    Cited by:

    1. Sekou Keita & Thomas Renault & Jérôme Valette, 2024. "The Usual Suspects: Offender Origin, Media Reporting and Natives’ Attitudes Towards Immigration," The Economic Journal, Royal Economic Society, vol. 134(657), pages 322-362.
    2. Alina Sîrbu & Diletta Goglia & Jisu Kim & Paul Maximilian Magos & Laura Pollacci & Spyridon Spyratos & Giulio Rossetti & Stefano Maria Iacus, 2024. "International mobility between the UK and Europe around Brexit: a data-driven study," Journal of Computational Social Science, Springer, vol. 7(2), pages 1451-1482, October.
    3. Martina Patone & Li‐Chun Zhang, 2021. "On Two Existing Approaches to Statistical Analysis of Social Media Data," International Statistical Review, International Statistical Institute, vol. 89(1), pages 54-71, April.
    4. Monica Alexander & Kivan Polimis & Emilio Zagheni, 2022. "Combining Social Media and Survey Data to Nowcast Migrant Stocks in the United States," Population Research and Policy Review, Springer;Southern Demographic Association (SDA), vol. 41(1), pages 1-28, February.
    5. Spyridon Spyratos & Michele Vespe & Fabrizio Natale & Ingmar Weber & Emilio Zagheni & Marzia Rango, 2019. "Quantifying international human mobility patterns using Facebook Network data," PLOS ONE, Public Library of Science, vol. 14(10), pages 1-22, October.
    6. Emiliano Gobbo & Lara Fontanella & Sara Fontanella & Annalina Sarra, 2022. "Geographies of Twitter debates," Journal of Computational Social Science, Springer, vol. 5(1), pages 647-663, May.
    7. Alexander, Monica & Zagheni, Emilio & Polimis, Kivan, 2019. "The impact of Hurricane Maria on out-migration from Puerto Rico: Evidence from Facebook data," SocArXiv 39s6c, Center for Open Science.
    8. Stephane Helleringer & Chong You & Laurence Fleury & Laetitia Douillot & Insa Diouf & Cheikh Tidiane Ndiaye & Valerie Delaunay & Rene Vidal, 2019. "Improving age measurement in low- and middle-income countries through computer vision: A test in Senegal," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 40(9), pages 219-260.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Stefan Jestl & Mathias Moser & Anna Katharina Raggl, 2021. "Cannot keep up with the Joneses: how relative deprivation pushes internal migration in Austria," International Journal of Social Economics, Emerald Group Publishing Limited, vol. 49(2), pages 210-231, November.
    2. Peter W. F. Smith & James Raymer & Corrado Giulietti, 2010. "Combining available migration data in England to study economic activity flows over time," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 173(4), pages 733-753, October.
    3. Yildiz Dilek & Smith Peter W.F., 2015. "Models for Combining Aggregate-Level Administrative Data in the Absence of a Traditional Census," Journal of Official Statistics, Sciendo, vol. 31(3), pages 431-451, September.
    4. L. B. Karachurina, 2020. "Attractiveness of Centers and Secondary Cities of Regions for Internal Migrants in Russia," Regional Research of Russia, Springer, vol. 10(3), pages 352-359, July.
    5. Barslund, Mikkel & Busse, Matthias, 2016. "How mobile is tech talent? A case study of IT professionals based on data from LinkedIn," CEPS Papers 11692, Centre for European Policy Studies.
    6. Willekens Frans, 2019. "Evidence-Based Monitoring of International Migration Flows in Europe," Journal of Official Statistics, Sciendo, vol. 35(1), pages 231-277, March.
    7. Ilya Kashnitsky & Nikita Mkrtchyan & Oleg Leshukov, 2016. "Interregional Migration of Youths in Russia: A Comprehensive Analysis of Demographic Statistics," Voprosy obrazovaniya / Educational Studies Moscow, National Research University Higher School of Economics, issue 3, pages 169-203.
    8. Klein, Jordan D. & Weber, Ingmar & Zagheni, Emilio, 2022. "Stop, in the name of COVID!," SocArXiv s3ztq, Center for Open Science.
    9. Steve Kirkwood & Viviene Cree & Daniel Winterstein & Alex Nuttgens & Jenni Sneddon, 2018. "Encountering #Feminism on Twitter: Reflections on a Research Collaboration between Social Scientists and Computer Scientists," Sociological Research Online, vol. 23(4), pages 763-779, December.
    10. Michel Guillot & Yan Yu, 2009. "Estimating health expectancies from two cross-sectional surveys," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 21(17), pages 503-534.
    11. Guy Abel, 2013. "Estimating global migration flow tables using place of birth data," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 28(18), pages 505-546.
    12. Stefano Breschi & Francesco Lissoni & Ernest Miguelez, 2018. "Return Migrants' Self-Selection: Evidence for Indian Inventors," NBER Chapters, in: The Roles of Immigrants and Foreign Students in US Science, Innovation, and Entrepreneurship, pages 17-48, National Bureau of Economic Research, Inc.
    13. Paul Chappell & Mike Tse & Minhao Zhang & Susan Moore, 2017. "Using GPS Geo-tagged Social Media Data and Geodemographics to Investigate Social Differences: A Twitter Pilot Study," Sociological Research Online, vol. 22(3), pages 38-56, September.
    14. Spyridon Spyratos & Michele Vespe & Fabrizio Natale & Ingmar Weber & Emilio Zagheni & Marzia Rango, 2019. "Quantifying international human mobility patterns using Facebook Network data," PLOS ONE, Public Library of Science, vol. 14(10), pages 1-22, October.
    15. P. N. Barbieri & F. Fazio & G. Gamberini, 2015. "Let Young People Join The Legislative Process. A Twitter Based Experiment On Internships," Working Papers wp995, Dipartimento Scienze Economiche, Universita' di Bologna.
    16. Liubov Antosik & Natalya Ivashina, 2021. "Factors and Routes of Interregional Migration of University Graduates in Russia," Voprosy obrazovaniya / Educational Studies Moscow, National Research University Higher School of Economics, issue 2, pages 107-125.
    17. Letizia Mencarini & Delia Irazú Hernández Farías & Mirko Lai & Viviana Patti & Emilio Sulis & Daniele Vignoli, 2019. "Happy parents’ tweets: An exploration of Italian Twitter data using sentiment analysis," Demographic Research, Max Planck Institute for Demographic Research, Rostock, Germany, vol. 40(25), pages 693-724.
    18. Lawrence M Berger & Giulia Ferrari & Marion Leturcq & Lidia Panico & Anne Solaz, 2021. "COVID-19 lockdowns and demographically-relevant Google Trends: A cross-national analysis," PLOS ONE, Public Library of Science, vol. 16(3), pages 1-28, March.
    19. Simionescu, Mihaela & Zimmermann, Klaus F., 2017. "Big Data and Unemployment Analysis," GLO Discussion Paper Series 81, Global Labor Organization (GLO).
    20. Katherine Curtis & Elizabeth Fussell & Jack DeWaard, 2015. "Recovery Migration After Hurricanes Katrina and Rita: Spatial Concentration and Intensification in the Migration System," Demography, Springer;Population Association of America (PAA), vol. 52(4), pages 1269-1293, August.

    More about this item

    Keywords

    population estimates; social media; Twitter; calibration

    JEL classification:

    • J1 - Labor and Demographic Economics - - Demographic Economics
    • Z0 - Other Special Topics - - General
