
Measuring Human Capital with Social Media Data and Machine Learning


  • Martina Jakob
  • Sebastian Heinrich


In response to persistent gaps in the availability of survey data, a new strand of research leverages alternative data sources through machine learning to track global development. While previous applications have been successful at predicting outcomes such as wealth, poverty or population density, we show that educational outcomes can be accurately estimated using geo-coded Twitter data and machine learning. Based on various input features, including user and tweet characteristics, topics, spelling mistakes, and network indicators, we can account for ~70 percent of the variation in educational attainment in Mexican municipalities and US counties.
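As a minimal illustration of what "accounting for ~70 percent of the variation" means, the sketch below computes the coefficient of determination (R²) from observed and predicted values. The function name and all numbers are hypothetical and are not taken from the paper; the abstract does not describe the authors' exact evaluation procedure, so this shows only the standard definition of the metric.

```python
def r_squared(observed, predicted):
    """Share of the variance in `observed` accounted for by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares: squared deviations from the observed mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: average years of schooling in five regions,
# and the values a fitted model might predict for those regions.
observed = [6.0, 8.0, 9.0, 11.0, 12.0]
predicted = [7.0, 7.5, 9.5, 10.0, 12.5]

print(round(r_squared(observed, predicted), 3))
```

An R² of 1 means the predictions reproduce the observed values exactly, while an R² of 0 means they do no better than always predicting the observed mean. In practice the metric would normally be computed on regions held out from model training.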

Suggested Citation

  • Martina Jakob & Sebastian Heinrich, 2023. "Measuring Human Capital with Social Media Data and Machine Learning," University of Bern Social Sciences Working Papers 46, University of Bern, Department of Social Sciences.
  • Handle: RePEc:bss:wpaper:46

    Download full text from publisher

    File Function: First version, 2023
    Download Restriction: no




    Cited by:

    1. Martina Jakob, Konstantin Buechel, Daniel Steffen, Aymo Brunetti, 2023. "Participatory Teaching Improves Learning Outcomes: Evidence from a Field Experiment in Tanzania," Diskussionsschriften dp2310, Universitaet Bern, Departement Volkswirtschaft.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Adel Daoud & Felipe Jordán & Makkunda Sharma & Fredrik Johansson & Devdatt Dubhashi & Sourabh Paul & Subhashis Banerjee, 2023. "Using Satellite Images and Deep Learning to Measure Health and Living Standards in India," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 167(1), pages 475-505, June.
    2. Takahiro Yabe & Bernardo García Bulle Bueno & Xiaowen Dong & Alex Pentland & Esteban Moro, 2023. "Behavioral changes during the COVID-19 pandemic decreased income diversity of urban encounters," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    3. António Afonso & José Alves & Krzysztof Beck, 2022. "Pay and unemployment determinants of migration flows in the European Union," Working Papers REM 2022/0251, ISEG - Lisbon School of Economics and Management, REM, Universidade de Lisboa.
    4. Lo Turco, Alessia & Maggioni, Daniela & Zazzaro, Alberto, 2019. "Financial dependence and growth: The role of input-output linkages," Journal of Economic Behavior & Organization, Elsevier, vol. 162(C), pages 308-328.
    5. Leopoldo Fergusson & Carlos Molina, 2020. "Facebook Causes Protests," HiCN Working Papers 323, Households in Conflict Network.
    6. Ufuk Akcigit & Murat Celik & Daron Acemoglu, 2014. "Young, Restless and Creative: Openness to Disruption and Creative Innovations," 2014 Meeting Papers 377, Society for Economic Dynamics.
    7. Iamsiraroj, Sasi, 2016. "The foreign direct investment–economic growth nexus," International Review of Economics & Finance, Elsevier, vol. 42(C), pages 116-133.
    8. Jungho Kim, 2023. "Female education and its impact on fertility," IZA World of Labor, Institute of Labor Economics (IZA), pages 228-228, May.
    9. Markus Brueckner & Daniel Lederman, 2018. "Inequality and economic growth: the role of initial income," Journal of Economic Growth, Springer, vol. 23(3), pages 341-366, September.
    10. Rosario Crinò & Paolo Epifani, 2014. "Trade Imbalances, Export Structure and Wage Inequality," Economic Journal, Royal Economic Society, vol. 0(576), pages 507-539, May.
    11. Marcén, Miriam & Molina, José Alberto & Morales, Marina, 2018. "The effect of culture on the fertility decisions of immigrant women in the United States," Economic Modelling, Elsevier, vol. 70(C), pages 15-28.
    12. Löschel, Andreas & Pothen, Frank & Schymura, Michael, 2015. "Peeling the onion: Analyzing aggregate, national and sectoral energy intensity in the European Union," Energy Economics, Elsevier, vol. 52(S1), pages 63-75.
    13. Oasis Kodila-Tedika & Julius Agbor, 2016. "Does Trust Matter for Entrepreneurship: Evidence from a Cross-Section of Countries," Economies, MDPI, vol. 4(1), pages 1-17, March.
    14. Barnabé Walheer, 2021. "A directional technology convergence index," Economics Bulletin, AccessEcon, vol. 41(3), pages 1330-1337.
    15. Paolo Di Caro & Roberta Arbolino & Ugo Marani, 2018. "A note on the effects of human capital policies in Italy during the Great Recession," Economics Bulletin, AccessEcon, vol. 38(3), pages 1302-1312.
    16. Haichao Fan & Xiang Gao, 2017. "Domestic Creditor Rights and External Private Debt," Economic Journal, Royal Economic Society, vol. 127(606), pages 2410-2440, November.
    17. Oliver Denk & Boris Cournède, 2015. "Finance and income inequality in OECD countries," OECD Economics Department Working Papers 1224, OECD Publishing.
    18. Jeni Klugman & Francisco Rodríguez & Hyung-Jin Choi, 2011. "The HDI 2010: new controversies, old critiques," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 9(2), pages 249-288, June.
    19. Castelló-Climent, Amparo & Mukhopadhyay, Abhiroop, 2013. "Mass education or a minority well educated elite in the process of growth: The case of India," Journal of Development Economics, Elsevier, vol. 105(C), pages 303-320.
    20. Sodiq Arogundade & Mduduzi Biyase & Hinaunye Eita, 2021. "Foreign Direct Investment and Inclusive Human Development in Sub-Saharan African Countries:Does local Economic Conditions Matter?," Economic Development and Well-being Research Group Working Paper Series edwrg-01-2021, University of Johannesburg, College of Business and Economics, revised 2021.

    More about this item


    Keywords:

    machine learning; social media data; education; human capital; indicators; natural language processing

    JEL classification:

    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • C80 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - General
    • O11 - Economic Development, Innovation, Technological Change, and Growth - - Economic Development - - - Macroeconomic Analyses of Economic Development
    • O15 - Economic Development, Innovation, Technological Change, and Growth - - Economic Development - - - Economic Development: Human Resources; Human Development; Income Distribution; Migration
    • I21 - Health, Education, and Welfare - - Education - - - Analysis of Education
    • I25 - Health, Education, and Welfare - - Education - - - Education and Economic Development
