Authors
- Seth Goodman
- Katherine Nolan
- Rachel Sayers
- Ariel BenYishay
- Jacob Hall
- Mavis Zupork Dome
- Edem Selormey
Abstract
Household surveys have been the foundation for poverty measurement in developing countries for the past half-century, but the spatial and temporal gaps in these survey data often limit how well anti-poverty programs can be targeted, monitored, or evaluated. To fill in these gaps, analysts and policymakers increasingly turn to machine learning (ML) methods to predict indices of asset wealth from satellite-based and other geospatial data. However, to date, the potential for gender-related differences in these methods’ performance has not been investigated. We implement a frequently used class of ML models (random forests) relying on readily accessible geospatial data and trained on and validated against a widely used source of asset holdings (a recent round of the Demographic & Health Survey in Ghana). By separately aggregating the asset holdings of female- and male-headed households within each survey cluster, we are able to estimate the distinctions in performance of ML models trained on each of these gender-based asset indices. We find that models trained on data from male-headed households achieve an impressive level of predictive accuracy (R² = 0.85), while those trained on data from female-headed households achieve reasonable but notably lower accuracy (R² = 0.75). Roughly half of this gap appears to be driven by the relatively smaller number of female-headed households in the survey sample. While we cannot rule out that the ML models themselves play a role in creating differences in performance across gender, it appears that these gaps may largely be a reflection of the sampling designs of the underlying survey data used as inputs for these models.
Our findings confirm that ML models can extend the spatial and temporal scope of these survey data to populations that were not randomly sampled, while also suggesting that survey designs should include larger samples of female-headed households to improve the models' predictive accuracy for those households.
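The sample-size explanation in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method and uses no survey data: it assumes each cluster has a latent wealth level, that each sampled household reports that level plus independent noise, and that the cluster index is the mean over sampled households. The function names, the household counts (15 and 5), and the noise level are all hypothetical. Under these assumptions, even a model that recovers latent wealth perfectly has its R² against the survey-based index capped at about 1 / (1 + noise_sd² / n), so clusters with fewer sampled households yield a lower ceiling.

```python
import random
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination of y_pred as a predictor of y_true."""
    mean_true = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def simulate_label_r2(households_per_cluster, n_clusters=2000, noise_sd=1.0, seed=0):
    """R^2 of a hypothetical perfect model against noisy cluster-level labels.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    latent = []  # latent cluster wealth, standing in for a perfect prediction
    labels = []  # survey-style index: mean over the sampled households
    for _ in range(n_clusters):
        wealth = rng.gauss(0.0, 1.0)
        sampled = [
            wealth + rng.gauss(0.0, noise_sd)
            for _ in range(households_per_cluster)
        ]
        latent.append(wealth)
        labels.append(statistics.fmean(sampled))
    return r_squared(labels, latent)


if __name__ == "__main__":
    # Hypothetical counts: more male-headed than female-headed households per cluster.
    for group, n in [("larger group (n=15)", 15), ("smaller group (n=5)", 5)]:
        print(f"{group}: R^2 ceiling ~ {simulate_label_r2(n):.3f}")
```

The simulation isolates only the label-noise channel. It says nothing about whether geospatial features are themselves less informative for one group, which the abstract notes cannot be ruled out.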
Suggested Citation
Seth Goodman & Katherine Nolan & Rachel Sayers & Ariel BenYishay & Jacob Hall & Mavis Zupork Dome & Edem Selormey, 2025.
"Equitable AI: Exploring the role of gender in poverty estimation models using geospatial data,"
PLOS ONE, Public Library of Science, vol. 20(9), pages 1-19, September.
Handle: RePEc:plo:pone00:0332193
DOI: 10.1371/journal.pone.0332193