
Combining Survey and Geospatial Data Can Significantly Improve Gender-Disaggregated Estimates of Labor Market Outcomes

Author

Listed:
  • Merfeld,Joshua David
  • Newhouse,David Locke
  • Weber,Michael
  • Lahiri,Partha

Abstract

Better understanding the geography of women’s labor market outcomes within countries is important to inform targeted efforts to increase women’s economic empowerment. This paper assesses the extent to which a method that combines simulated survey data from urban areas in Mexico with broadly available geospatial indicators from Google Earth Engine and OpenStreetMap can significantly improve estimates of labor force participation and unemployment rates. Incorporating geospatial information substantially increases the accuracy of male and female labor force participation and unemployment rates at the state level, reducing mean absolute deviation by 50 to 62 percent for labor force participation and 25 to 52 percent for unemployment. Small area estimation using a nested error conditional random effect model also greatly improves municipal estimates of labor force participation, as the mean absolute error falls by approximately half, while the mean squared error falls by almost 75 percent when holding coverage rates constant. In contrast, the results for municipal unemployment rate estimates are not reliable because values of unemployment rates are low and therefore poorly suited for linear models. The municipal results hold in repeated simulations of alternative samples. Models utilizing Basic Geo-Statistical Area (AGEB)–level auxiliary information generate more accurate predictions than area-level models specified using the same auxiliary data. Overall, integrating survey data and publicly available geospatial indicators is feasible and can greatly improve state-level estimates of male and female labor force participation and unemployment rates, as well as municipal estimates of male and female labor force participation.
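
The accuracy measures named in the abstract (mean absolute error and mean squared error, compared between a baseline and a model-based set of estimates) can be illustrated with a minimal sketch. This is not the authors' code: the function names and all numbers below are hypothetical, invented purely to show the arithmetic, and the sketch does not implement the nested error model or the coverage adjustment the abstract mentions.

```python
from statistics import fmean


def mean_absolute_error(estimates, truth):
    """Average absolute gap between estimated and true area rates."""
    return fmean(abs(e - t) for e, t in zip(estimates, truth))


def mean_squared_error(estimates, truth):
    """Average squared gap between estimated and true area rates."""
    return fmean((e - t) ** 2 for e, t in zip(estimates, truth))


def percent_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Toy labor force participation rates for four hypothetical areas.
# These values are invented for illustration and are NOT from the paper.
truth = [0.40, 0.50, 0.60, 0.45]
direct = [0.48, 0.42, 0.68, 0.37]  # hypothetical survey-only estimates
model = [0.44, 0.46, 0.64, 0.41]   # hypothetical model-based estimates

mae_direct = mean_absolute_error(direct, truth)
mae_model = mean_absolute_error(model, truth)
mse_direct = mean_squared_error(direct, truth)
mse_model = mean_squared_error(model, truth)

print(f"MAE reduction: {percent_reduction(mae_direct, mae_model):.1f} percent")
print(f"MSE reduction: {percent_reduction(mse_direct, mse_model):.1f} percent")
```

In this toy setup every error is exactly halved, so the mean absolute error falls by about 50 percent and the mean squared error by about 75 percent, since squaring a halved error quarters it. Real estimates would not shrink uniformly, so the two reductions need not line up this neatly.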

Suggested Citation

  • Merfeld,Joshua David & Newhouse,David Locke & Weber,Michael & Lahiri,Partha, 2022. "Combining Survey and Geospatial Data Can Significantly Improve Gender-Disaggregated Estimates of Labor Market Outcomes," Policy Research Working Paper Series 10077, The World Bank.
  • Handle: RePEc:wbk:wbrwps:10077

    Download full text from publisher

    File URL: http://documents.worldbank.org/curated/en/099321406092229138/pdf/IDU016f95e0806fc6044ea0b843007d5dc0ef17e.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Stephan Klasen, 2019. "What Explains Uneven Female Labor Force Participation Levels and Trends in Developing Countries?," The World Bank Research Observer, World Bank, vol. 34(2), pages 161-197.
    2. Sophia Rabe‐Hesketh & Anders Skrondal, 2006. "Multilevel modelling of complex survey data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 169(4), pages 805-827, October.
    3. Isabel Molina & Ayoub Saei & M. José Lombardía, 2007. "Small area estimates of labour force participation under a multinomial logit mixed model," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(4), pages 975-1000, October.
    4. Linden McBride & Christopher B. Barrett & Christopher Browne & Leiqiu Hu & Yanyan Liu & David S. Matteson & Ying Sun & Jiaming Wen, 2022. "Predicting poverty and malnutrition for targeting, mapping, monitoring, and early warning," Applied Economic Perspectives and Policy, John Wiley & Sons, vol. 44(2), pages 879-892, June.
    5. Yolanda Marhuenda & Isabel Molina & Domingo Morales & J. N. K. Rao, 2017. "Poverty mapping in small areas under a twofold nested error regression model," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(4), pages 1111-1136, October.
    6. Ray Chambers & Nicola Salvati & Nikos Tzavidis, 2016. "Semiparametric small area estimation for binary outcomes with application to unemployment estimation for local authorities in the UK," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 179(2), pages 453-479, February.
    7. Jiming Jiang & P. Lahiri, 2001. "Empirical Best Prediction for Small Area Inference with Binary Data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 53(2), pages 217-243, June.
    8. Yan Li & Partha Lahiri, 2019. "A Simple Adaptation of Variable Selection Software for Regression Models to Select Variables in Nested Error Regression Models," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 81(2), pages 302-317, December.
    9. Zhang, Yiyun & Li, Runze & Tsai, Chih-Ling, 2010. "Regularization Parameter Selections via Generalized Information Criterion," Journal of the American Statistical Association, American Statistical Association, vol. 105(489), pages 312-323.
    10. Andreea L. Erciulescu & Nathan B. Cruze & Balgobin Nandram, 2019. "Model‐based county level crop estimates incorporating auxiliary sources of information," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 182(1), pages 283-303, January.
    11. Christopher Yeh & Anthony Perez & Anne Driscoll & George Azzari & Zhongyi Tang & David Lobell & Stefano Ermon & Marshall Burke, 2020. "Using publicly available satellite imagery and deep learning to understand economic well-being in Africa," Nature Communications, Nature, vol. 11(1), pages 1-11, December.
    12. David B Lobell & George Azzari & Marshall Burke & Sydney Gourlay & Zhenong Jin & Talip Kilic & Siobhan Murray, 2020. "Eyes in the Sky, Boots on the Ground: Assessing Satellite‐ and Ground‐Based Approaches to Crop Yield Measurement and Analysis," American Journal of Agricultural Economics, John Wiley & Sons, vol. 102(1), pages 202-219, January.
    13. Federico Belotti & Partha Deb & Willard G. Manning & Edward C. Norton, 2015. "twopm: Two-part models," Stata Journal, StataCorp LP, vol. 15(1), pages 3-20, March.
    14. Masaki,Takaaki & Newhouse,David Locke & Silwal,Ani Rudra & Bedada,Adane & Engstrom,Ryan, 2020. "Small Area Estimation of Non-Monetary Poverty with Geospatial Data," Policy Research Working Paper Series 9383, The World Bank.

    Citations


    Cited by:

    1. Newhouse,David Locke & Merfeld,Joshua David & Ramakrishnan,Anusha Pudugramam & Swartz,Tom & Lahiri,Partha, 2022. "Small Area Estimation of Monetary Poverty in Mexico Using Satellite Imagery and Machine Learning," Policy Research Working Paper Series 10175, The World Bank.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Angelo Moretti, 2023. "Estimation of small area proportions under a bivariate logistic mixed model," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(4), pages 3663-3684, August.
    2. María Dolores Esteban & María José Lombardía & Esther López‐Vizcaíno & Domingo Morales & Agustín Pérez, 2022. "Empirical best prediction of small area bivariate parameters," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(4), pages 1699-1727, December.
    3. Newhouse,David Locke & Merfeld,Joshua David & Ramakrishnan,Anusha Pudugramam & Swartz,Tom & Lahiri,Partha, 2022. "Small Area Estimation of Monetary Poverty in Mexico Using Satellite Imagery and Machine Learning," Policy Research Working Paper Series 10175, The World Bank.
    4. Masaki,Takaaki & Newhouse,David Locke & Silwal,Ani Rudra & Bedada,Adane & Engstrom,Ryan, 2020. "Small Area Estimation of Non-Monetary Poverty with Geospatial Data," Policy Research Working Paper Series 9383, The World Bank.
    5. Joscha Krause & Jan Pablo Burgard & Domingo Morales, 2022. "$\ell_2$-penalized approximate likelihood inference in logit mixed models for regional prevalence estimation under covariate rank-deficiency," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 85(4), pages 459-489, May.
    6. M. Giovanna Ranalli & Giorgio E. Montanari & Cecilia Vicarelli, 2018. "Estimation of small area counts with the benchmarking property," METRON, Springer;Sapienza Università di Roma, vol. 76(3), pages 349-378, December.
    7. María Dolores Esteban & María José Lombardía & Esther López-Vizcaíno & Domingo Morales & Agustín Pérez, 2020. "Small area estimation of proportions under area-level compositional mixed models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(3), pages 793-818, September.
    8. GIBSON, John & ZHANG, Xiaoxuan & PARK, Albert & YI, Jiang & XI, Li, 2024. "Remotely measuring rural economic activity and poverty : Do we just need better sensors?," CEI Working Paper Series 2023-08, Center for Economic Institutions, Institute of Economic Research, Hitotsubashi University.
    9. Ola Hall & Francis Dompae & Ibrahim Wahab & Fred Mawunyo Dzanku, 2023. "A review of machine learning and satellite imagery for poverty prediction: Implications for development research and applications," Journal of International Development, John Wiley & Sons, Ltd., vol. 35(7), pages 1753-1768, October.
    10. James Dawber & Nicola Salvati & Enrico Fabrizi & Nikos Tzavidis, 2022. "Expectile regression for multi‐category outcomes with application to small area estimation of labour force participation," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(S2), pages 590-619, December.
    11. Yujun Zhou & Erin Lentz & Hope Michelson & Chungmann Kim & Kathy Baylis, 2022. "Machine learning for food security: Principles for transparency and usability," Applied Economic Perspectives and Policy, John Wiley & Sons, vol. 44(2), pages 893-910, June.
    12. Anders Skrondal & Sophia Rabe‐Hesketh, 2009. "Prediction in multilevel generalized linear models," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 172(3), pages 659-687, June.
    13. Hao Sun & Emily Berg & Zhengyuan Zhu, 2022. "Bivariate small‐area estimation for binary and gaussian variables based on a conditionally specified model," Biometrics, The International Biometric Society, vol. 78(4), pages 1555-1565, December.
    14. Maryia Bakhtsiyarava & Tim G. Williams & Andrew Verdin & Seth D. Guikema, 2021. "A nonparametric analysis of household-level food insecurity and its determinant factors: exploratory study in Ethiopia and Nigeria," Food Security: The Science, Sociology and Economics of Food Production and Access to Food, Springer;The International Society for Plant Pathology, vol. 13(1), pages 55-70, February.
    15. Binh Tang & Yanyan Liu & David S. Matteson, 2022. "Predicting poverty with vegetation index," Applied Economic Perspectives and Policy, John Wiley & Sons, vol. 44(2), pages 930-945, June.
    16. Li, Qing & Yu, Shuai & Échevin, Damien & Fan, Min, 2022. "Is poverty predictable with machine learning? A study of DHS data from Kyrgyzstan," Socio-Economic Planning Sciences, Elsevier, vol. 81(C).
    17. Domingo Morales & Joscha Krause & Jan Pablo Burgard, 2022. "On the Use of Aggregate Survey Data for Estimating Regional Major Depressive Disorder Prevalence," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 344-368, March.
    18. Corral Rodas,Paul Andres & Kastelic,Kristen Himelein & Mcgee,Kevin Robert & Molina,Isabel, 2021. "A Map of the Poor or a Poor Map?," Policy Research Working Paper Series 9620, The World Bank.
    19. Joscha Krause & Jan Pablo Burgard & Domingo Morales, 2022. "Robust prediction of domain compositions from uncertain data using isometric logratio transformations in a penalized multivariate Fay–Herriot model," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 76(1), pages 65-96, February.
    20. Timo Schmid & Fabian Bruckschen & Nicola Salvati & Till Zbiranski, 2017. "Constructing sociodemographic indicators for national statistical institutes by using mobile phone data: estimating literacy rates in Senegal," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(4), pages 1163-1190, October.

    More about this item

    JEL classification:

    • J21 - Labor and Demographic Economics - - Demand and Supply of Labor - - - Labor Force and Employment, Size, and Structure
    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
