
Is poverty predictable with machine learning? A study of DHS data from Kyrgyzstan


  • Li, Qing
  • Yu, Shuai
  • Échevin, Damien
  • Fan, Min


A prerequisite for eliminating poverty is to accurately identify and target the households in poverty. While some factors, such as asset holdings, are well recognized as relevant for assessing and predicting poverty, a priori selected indicators are not sufficient conditions for poverty, and the key factors may vary from one case to another. Researchers have begun to apply machine learning algorithms to predict poor households. This paper uses prediction accuracy as the standard for studying the application of machine learning algorithms. Using DHS data on 8040 households in Kyrgyzstan, we apply a state-of-the-art algorithm (XGBoost) to explore the full dataset, exploiting the algorithm's ability to handle many variables, and compare the results with those obtained from the a priori selected variables. We also compare XGBoost with a generalized linear model (GLM), the latter being viewed as an approach in between traditional models and modern machine learning algorithms. The results imply that including more variables is not necessarily preferable for prediction; a few important variables selected by the algorithms may also perform well. Different algorithms may select different variables as the important ones for prediction. XGBoost performs better than GLM in most cases, and machine learning is useful for variable selection. Additionally, XGBoost is particularly preferable when using a priori variables.
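
The comparison the abstract describes (a model trained on all variables versus one trained on a few variables the algorithm itself ranks as important) can be illustrated with a small sketch. This is not the authors' code and does not use XGBoost or the DHS data: it assumes synthetic households with eight made-up indicators, of which only the first two drive the "poor" label, and it swaps in plain gradient-boosted decision stumps with squared-error loss as a toy stand-in for a boosting algorithm. All function names and parameters are hypothetical.

```python
import random

def make_data(n=400, p=8, seed=0):
    """Synthetic households (assumption): only indicators 0 and 1 drive the label."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(p)] for _ in range(n)]
    y = [1 if row[0] + row[1] + rng.gauss(0, 0.1) < 0.8 else 0 for row in X]
    return X, y

def fit_stump(X, r, features):
    """Find the single split that most reduces squared error on residuals r."""
    n, total = len(X), sum(r)
    best = None
    for j in features:
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:  # cannot split between identical values
                continue
            right_sum = total - left_sum
            gain = left_sum**2 / k + right_sum**2 / (n - k) - total**2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def boost(X, y, features, rounds=30, lr=0.3):
    """Boosted stumps; importance is the total error reduction per variable."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, importance = [], {j: 0.0 for j in features}
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        gain, j, thr, left_val, right_val = fit_stump(X, residuals, features)
        stumps.append((j, thr, lr * left_val, lr * right_val))
        importance[j] += gain
        pred = [pi + (lr * left_val if row[j] < thr else lr * right_val)
                for pi, row in zip(pred, X)]
    return base, stumps, importance

def predict(model, X):
    """Classify a household as poor (1) when the boosted score exceeds 0.5."""
    base, stumps, _ = model
    scores = [base + sum(lv if row[j] < thr else rv for j, thr, lv, rv in stumps)
              for row in X]
    return [1 if s > 0.5 else 0 for s in scores]

def accuracy(y, y_hat):
    return sum(a == b for a, b in zip(y, y_hat)) / len(y)

X, y = make_data()
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

# Model 1: all variables. Model 2: only the two variables ranked most important.
full = boost(X_train, y_train, list(range(8)))
top2 = sorted(full[2], key=full[2].get, reverse=True)[:2]
small = boost(X_train, y_train, top2)

acc_full = accuracy(y_test, predict(full, X_test))
acc_small = accuracy(y_test, predict(small, X_test))
print("selected variables:", sorted(top2))
print("accuracy, all variables:", acc_full)
print("accuracy, selected variables:", acc_small)
```

Held-out accuracy is the yardstick here, mirroring the paper's use of prediction accuracy as the standard. On real survey data the ranking and the accuracy gap would depend on the algorithm and the variables available, which is the point the abstract makes.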

Suggested Citation

  • Li, Qing & Yu, Shuai & Échevin, Damien & Fan, Min, 2022. "Is poverty predictable with machine learning? A study of DHS data from Kyrgyzstan," Socio-Economic Planning Sciences, Elsevier, vol. 81(C).
  • Handle: RePEc:eee:soceps:v:81:y:2022:i:c:s0038012121001877
    DOI: 10.1016/j.seps.2021.101195



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gabriel Okasa, 2022. "Meta-Learners for Estimation of Causal Effects: Finite Sample Cross-Fit Performance," Papers 2201.12692.
    2. Athey, Susan & Imbens, Guido W., 2019. "Machine Learning Methods Economists Should Know About," Research Papers 3776, Stanford University, Graduate School of Business.
    3. Luigi Biagini & Simone Severini, 2021. "The role of Common Agricultural Policy (CAP) in enhancing and stabilising farm income: an analysis of income transfer efficiency and the Income Stabilisation Tool," Papers 2104.14188.
    4. Elena Ivona DUMITRESCU & Sullivan HUE & Christophe HURLIN & Sessi TOKPAVI, 2020. "Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds," LEO Working Papers / DR LEO 2839, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    5. Christoph F. Kurz & Martin Rehm & Rolf Holle & Christina Teuner & Michael Laxy & Larissa Schwarzkopf, 2019. "The effect of bariatric surgery on health care costs: A synthetic control approach using Bayesian structural time series," Health Economics, John Wiley & Sons, Ltd., vol. 28(11), pages 1293-1307, November.
    6. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, revised Jan 2022.
    7. Camilla Beck Olsen & Hans Olav Melberg, 2018. "Did adolescents in Norway respond to the elimination of copayments for general practitioner services?," Health Economics, John Wiley & Sons, Ltd., vol. 27(7), pages 1120-1130, July.
    8. Barzin, Samira & Avner, Paolo & Maruyama Rentschler, Jun Erik & O’Clery, Neave, 2022. "Where Are All the Jobs? A Machine Learning Approach for High Resolution Urban Employment Prediction in Developing Countries," Policy Research Working Paper Series 9979, The World Bank.
    9. Maria-Carmen García-Centeno & Román Mínguez-Salido & Raúl del Pozo-Rubio, 2021. "The Classification of Profiles of Financial Catastrophe Caused by Out-of-Pocket Payments: A Methodological Approach," Mathematics, MDPI, vol. 9(11), pages 1-20, May.
    10. Colombo, Emilio & Pelagatti, Matteo, 2020. "Statistical learning and exchange rate forecasting," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1260-1289.
    11. Aziza Usmanova & Ahmed Aziz & Dilshodjon Rakhmonov & Walid Osamy, 2022. "Utilities of Artificial Intelligence in Poverty Prediction: A Review," Sustainability, MDPI, vol. 14(21), pages 1-39, October.
    12. James T. E. Chapman & Ajit Desai, 2022. "Macroeconomic Predictions using Payments Data and Machine Learning," Papers 2209.00948.
    13. Gür Ali, Özden & Gürlek, Ragıp, 2020. "Automatic Interpretable Retail forecasting with promotional scenarios," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1389-1406.
    14. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    15. Carstensen, Kai & Heinrich, Markus & Reif, Magnus & Wolters, Maik H., 2020. "Predicting ordinary and severe recessions with a three-state Markov-switching dynamic factor model," International Journal of Forecasting, Elsevier, vol. 36(3), pages 829-850.
    16. Hou-Tai Chang & Ping-Huai Wang & Wei-Fang Chen & Chen-Ju Lin, 2022. "Risk Assessment of Early Lung Cancer with LDCT and Health Examinations," IJERPH, MDPI, vol. 19(8), pages 1-12, April.
    17. Wang, Qiao & Zhou, Wei & Cheng, Yonggang & Ma, Gang & Chang, Xiaolin & Miao, Yu & Chen, E, 2018. "Regularized moving least-square method and regularized improved interpolating moving least-square method with nonsingular moment matrices," Applied Mathematics and Computation, Elsevier, vol. 325(C), pages 120-145.
    18. Lucian Belascu & Alexandra Horobet & Georgiana Vrinceanu & Consuela Popescu, 2021. "Performance Dissimilarities in European Union Manufacturing: The Effect of Ownership and Technological Intensity," Sustainability, MDPI, vol. 13(18), pages 1-19, September.
    19. Candelon, B. & Hurlin, C. & Tokpavi, S., 2012. "Sampling error and double shrinkage estimation of minimum variance portfolios," Journal of Empirical Finance, Elsevier, vol. 19(4), pages 511-527.
    20. Kim, Hyun Hak & Swanson, Norman R., 2018. "Mining big data using parsimonious factor, machine learning, variable selection and shrinkage methods," International Journal of Forecasting, Elsevier, vol. 34(2), pages 339-354.

