Printed from https://ideas.repec.org/a/wly/povpop/v9y2017i1p118-133.html

Is Random Forest a Superior Methodology for Predicting Poverty? An Empirical Assessment

Authors

  • Thomas Pave Sohnesen
  • Niels Stender

Abstract

Random forest (RF) is in many fields of research a common method for data‐driven predictions. Within economics and prediction of poverty, RF is rarely used. Comparing out‐of‐sample predictions in surveys for the same year in six countries shows that RF is often more accurate than current common practice (multiple imputations with variables selected by Stepwise and Lasso), suggesting that this method could contribute to better poverty predictions. However, none of the methods consistently provides accurate predictions of poverty over time, highlighting that technical model fitting by any method within a single year is not always, by itself, sufficient for accurate predictions of poverty over time.

Suggested Citation

  • Thomas Pave Sohnesen & Niels Stender, 2017. "Is Random Forest a Superior Methodology for Predicting Poverty? An Empirical Assessment," Poverty & Public Policy, John Wiley & Sons, vol. 9(1), pages 118-133, March.
  • Handle: RePEc:wly:povpop:v:9:y:2017:i:1:p:118-133
    DOI: 10.1002/pop4.169

    Download full text from publisher

    File URL: https://doi.org/10.1002/pop4.169
    Download Restriction: no

    References listed on IDEAS

    1. Talip Kilic & Thomas Pave Sohnesen, 2019. "Same Question But Different Answer: Experimental Evidence on Questionnaire Design's Impact on Poverty Measured by Proxies," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 65(1), pages 144-165, March.
    2. Dang, Hai-Anh H. & Lanjouw, Peter F. & Serajuddin, Umar, 2014. "Updating poverty estimates at frequent intervals in the absence of consumption data: methods and illustration with reference to a middle-income country," Policy Research Working Paper Series 7043, The World Bank.
    3. Astrid Mathiassen, 2013. "Testing Prediction Performance of Poverty Models: Empirical Evidence from Uganda," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 59(1), pages 91-112, March.
    4. Luc Christiaensen & Peter Lanjouw & Jill Luoto & David Stifel, 2012. "Small area estimation-based prediction methods to track poverty: validation and applications," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 10(2), pages 267-297, June.
    5. Jennifer L. Castle & Xiaochuan Qin & W. Robert Reed, 2009. "How To Pick The Best Regression Equation: A Review And Comparison Of Model Selection Algorithms," Working Papers in Economics 09/13, University of Canterbury, Department of Economics and Finance.
    6. Calogero Carletto & Talip Kilic, 2011. "Moving Up the Ladder? The Impact of Migration Experience on Occupational Mobility in Albania," Journal of Development Studies, Taylor & Francis Journals, vol. 47(6), pages 846-869.
    7. Sudarno Sumarto & Daniel Suryadarma & Asep Suryahadi, 2007. "Predicting Consumption Poverty using Non-Consumption Indicators: Experiments using Indonesian Data," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 81(3), pages 543-578, May.
    8. Hal R. Varian, 2014. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 3-28, Spring.

    Citations

    Cited by:

    1. Thomas Pave Sohnesen & Peter Fisker & David Malmgren‐Hansen, 2022. "Using Satellite Data to Guide Urban Poverty Reduction," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 68(S2), pages 282-294, December.
    2. Beltramo, Theresa P. & Calvi, Rossella & De Giorgi, Giacomo & Sarr, Ibrahima, 2023. "Child poverty among refugees," World Development, Elsevier, vol. 171(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pave Sohnesen, Thomas & Stender, Niels, 2016. "Is random forest a superior methodology for predicting poverty? an empirical assessment," Policy Research Working Paper Series 7612, The World Bank.
    2. Hai‐Anh Dang & Dean Jolliffe & Calogero Carletto, 2019. "Data Gaps, Data Incomparability, And Data Imputation: A Review Of Poverty Measurement Methods For Data‐Scarce Environments," Journal of Economic Surveys, Wiley Blackwell, vol. 33(3), pages 757-797, July.
    3. Ligon, Ethan & Christiaensen, Luc & Sohnesen, Thomas P, 2020. "Should Consumption Sub-Aggregates be Used to Measure Poverty?," Department of Agricultural & Resource Economics, UC Berkeley, Working Paper Series qt9b9929jh, Department of Agricultural & Resource Economics, UC Berkeley.
    4. Hai-Anh H. Dang & Peter F. Lanjouw & Umar Serajuddin, 2017. "Updating poverty estimates in the absence of regular and comparable consumption data: methods and illustration with reference to a middle-income country," Oxford Economic Papers, Oxford University Press, vol. 69(4), pages 939-962.
    5. Dang, Hai-Anh H. & Verme, Paolo, 2019. "Estimating Poverty for Refugee Populations: Can Cross-Survey Imputation Methods Substitute for Data Scarcity?," GLO Discussion Paper Series 429, Global Labor Organization (GLO).
    6. Dang, Hai-Anh H. & Kilic, Talip & Carletto, Calogero & Abanokova, Kseniya, 2021. "Poverty Imputation in Contexts without Consumption Data: A Revisit with Further Refinements," Policy Research Working Paper Series 9838, The World Bank.
    7. Talip Kilic & Thomas Pave Sohnesen, 2019. "Same Question But Different Answer: Experimental Evidence on Questionnaire Design's Impact on Poverty Measured by Proxies," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 65(1), pages 144-165, March.
    8. Hai-Anh H. Dang & Peter F. Lanjouw, 2018. "Poverty Dynamics in India between 2004 and 2012: Insights from Longitudinal Analysis Using Synthetic Panel Data," Economic Development and Cultural Change, University of Chicago Press, vol. 67(1), pages 131-170.
    9. Abate, Gashaw T. & de Brauw, Alan & Hirvonen, Kalle & Wolle, Abdulazize, 2023. "Measuring consumption over the phone: Evidence from a survey experiment in urban Ethiopia," Journal of Development Economics, Elsevier, vol. 161(C).
    10. Jose Cuesta & Gabriel Lara Ibarra, 2017. "Comparing Cross-Survey Micro Imputation and Macro Projection Techniques: Poverty in Post Revolution Tunisia," Journal of Income Distribution, Ad libros publications inc., vol. 25(1), pages 1-30, March.
    11. Astrid Mathiassen & Bjørn K. Wold, 2019. "Challenges in predicting poverty trends using survey to survey imputation. Experiences from Malawi," Discussion Papers 900, Statistics Norway, Research Department.
    12. Jose Cuesta & Gabriel Lara Ibarra, 2018. "Comparing Cross-Survey Micro Imputation and Macro Projection Techniques: Poverty in Post Revolution Tunisia," Journal of Income Distribution, Ad libros publications inc., vol. 25(1), pages 1-30, March.
    13. Hai-Anh H. Dang & Paolo Verme, 2023. "Estimating poverty for refugees in data-scarce contexts: an application of cross-survey imputation," Journal of Population Economics, Springer;European Society for Population Economics, vol. 36(2), pages 653-679, April.
    14. Dang, Hai-Anh & Lanjouw, Peter F., 2021. "Data Scarcity and Poverty Measurement," IZA Discussion Papers 14631, Institute of Labor Economics (IZA).
    15. Betti, Gianni & Molini, Vasco & Mori, Lorenzo, 2022. "New Algorithm to Estimate Inequality Measures in Cross-Survey Imputation: An Attempt to Correct the Underestimation of Extreme Values," Policy Research Working Paper Series 10013, The World Bank.
    16. Sophie-Charlotte Klose & Johannes Lederer, 2020. "A Pipeline for Variable Selection and False Discovery Rate Control With an Application in Labor Economics," Papers 2006.12296, arXiv.org, revised Jun 2020.
    17. Patrick Bajari & Victor Chernozhukov & Ali Hortaçsu & Junichi Suzuki, 2019. "The Impact of Big Data on Firm Performance: An Empirical Investigation," AEA Papers and Proceedings, American Economic Association, vol. 109, pages 33-37, May.
    18. Nathan, Max & Rosso, Anna, 2014. "Mapping information economy businesses with big data: findings from the UK," LSE Research Online Documents on Economics 60615, London School of Economics and Political Science, LSE Library.
    19. Akash Malhotra, 2018. "A hybrid econometric-machine learning approach for relative importance analysis: Prioritizing food policy," Papers 1806.04517, arXiv.org, revised Aug 2020.
