Printed from https://ideas.repec.org/p/ese/cempwp/cempa9-25.html

Machine learning regionalisation of input data for microsimulation models: An application of a hybrid GBM / IPF method to build a tax-benefit model for the Essex region in the UK

Author

Listed:
  • Richiardi, Matteo
  • Rejoice, Frimpong

Abstract

Developing microsimulation models often requires reweighting an input dataset to reflect the characteristics of a different population of interest. In this paper we explore a machine learning approach in which a variant of decision trees (Gradient Boosted Machine, GBM) is used to replicate the joint distribution of target variables observed in a large, commercially available but slightly biased dataset, followed by a raking step (iterative proportional fitting, IPF) that removes the bias and ensures that the relevant marginal distributions are consistent with official statistics. The method is applied to build a regional variant of UKMOD, an open-source static tax-benefit model for the UK belonging to the EUROMOD family, for the Greater Essex region.
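The raking step mentioned in the abstract can be illustrated with a minimal sketch of iterative proportional fitting. This is not the authors' implementation: the function name `rake`, the toy variables `sex` and `age`, and the target totals are hypothetical and chosen only for illustration, and the GBM stage that would supply the starting weights is replaced here by uniform weights.

```python
import numpy as np

def rake(weights, categories, targets, max_iter=100, tol=1e-8):
    """Adjust unit weights so weighted category totals match target margins.

    weights    : initial weights, shape (n,)
    categories : dict mapping a variable name to integer category codes, shape (n,)
    targets    : dict mapping the same names to target totals per category code
    """
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(max_iter):
        max_gap = 0.0
        for name, codes in categories.items():
            target = np.asarray(targets[name], dtype=float)
            # Current weighted total in each category of this variable.
            current = np.bincount(codes, weights=w, minlength=len(target))
            max_gap = max(max_gap, float(np.max(np.abs(current - target))))
            # Scale factor per category; empty categories are left unchanged.
            factors = np.divide(target, current,
                                out=np.ones_like(target), where=current > 0)
            w *= factors[codes]
        # Stop once every margin was already matched at the start of the sweep.
        if max_gap < tol:
            break
    return w

# Hypothetical toy sample of six units with two categorical variables.
sex = np.array([0, 0, 1, 1, 1, 0])
age = np.array([0, 1, 0, 1, 1, 1])
start = np.ones(6)  # stand-in for weights produced by an earlier model stage

raked = rake(start,
             {"sex": sex, "age": age},
             {"sex": [50.0, 50.0], "age": [40.0, 60.0]})

print(np.bincount(sex, weights=raked))
print(np.bincount(age, weights=raked))
```

Each sweep rescales the weights one variable at a time, so only the margins are forced to agree with the targets; the association between variables carried by the starting weights is otherwise preserved, which is why the quality of the first-stage weights still matters.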

Suggested Citation

  • Richiardi, Matteo & Rejoice, Frimpong, 2025. "Machine learning regionalisation of input data for microsimulation models: An application of a hybrid GBM / IPF method to build a tax-benefit model for the Essex region in the UK," Centre for Microsimulation and Policy Analysis Working Paper Series CEMPA9/25, Centre for Microsimulation and Policy Analysis at the Institute for Social and Economic Research.
  • Handle: RePEc:ese:cempwp:cempa9-25

    Download full text from publisher

    File URL: https://www.iser.essex.ac.uk/wp-content/uploads/files/working-papers/cempa/cempa9-25.pdf
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Victor Chernozhukov & Iván Fernández‐Val & Blaise Melly, 2013. "Inference on Counterfactual Distributions," Econometrica, Econometric Society, vol. 81(6), pages 2205-2268, November.
    2. Tymon Słoczyński, 2015. "The Oaxaca–Blinder Unexplained Component as a Treatment Effects Estimator," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 77(4), pages 588-604, August.
    3. Krolikowski, Pawel & Zabek, Mike & Coate, Patrick, 2020. "Parental proximity and earnings after job displacements," Labour Economics, Elsevier, vol. 65(C).
    4. Thomas Y. Mathä & Alessandro Porpiglia & Michael Ziegelmeyer, 2014. "Wealth differences across borders and the effect of real estate price dynamics: Evidence from two household surveys," BCL working papers 90, Central Bank of Luxembourg.
    5. Joanna Tyrowicz & Lucas van der Velde, 2017. "When the opportunity knocks: large structural shocks and gender wage gaps," GRAPE Working Papers 2, GRAPE Group for Research in Applied Economics.
    6. Leone Leonida & Marianna Marra & Sergio Scicchitano & Antonio Giangreco & Marco Biagetti, 2020. "Estimating the Wage Premium to Supervision for Middle Managers in Different Contexts: Evidence from Germany and the UK," Work, Employment & Society, British Sociological Association, vol. 34(6), pages 1004-1026, December.
    7. Töpfer, Marina, 2017. "Detailed RIF decomposition with selection: The gender pay gap in Italy," Hohenheim Discussion Papers in Business, Economics and Social Sciences 26-2017, University of Hohenheim, Faculty of Business, Economics and Social Sciences.
    8. Sergio Longobardi & Margherita Maria Pagliuca & Andrea Regoli, 2018. "Can problem-solving attitudes explain the gender gap in financial literacy? Evidence from Italian students’ data," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(4), pages 1677-1705, July.
    9. Cemal Eren Arbatlı & Quamrul H. Ashraf & Oded Galor & Marc Klemp, 2020. "Diversity and Conflict," Econometrica, Econometric Society, vol. 88(2), pages 727-797, March.
    10. Bartels, Charlotte & Sierminska, Eva & Schröder, Carsten, 2025. "Wealth creators or inheritors? Unpacking the gender wealth gap from bottom to top and young to old," Economics Letters, Elsevier, vol. 246(C).
    11. Manuel Arellano & Stéphane Bonhomme, 2017. "Quantile Selection Models With an Application to Understanding Changes in Wage Inequality," Econometrica, Econometric Society, vol. 85, pages 1-28, January.
    12. Michel Lubrano & Abdoul Aziz Junior Ndoye, 2014. "Bayesian Unconditional Quantile Regression: An Analysis of Recent Expansions in Wage Structure and Earnings Inequality in the US 1992–2009," Scottish Journal of Political Economy, Scottish Economic Society, vol. 61(2), pages 129-153, May.
    13. Calvo,Paula Andrea & Lopez-Calva,Luis-Felipe & Posadas,Josefina, 2015. "A decade of declining earnings inequality in the Russian Federation," Policy Research Working Paper Series 7392, The World Bank.
    14. Sloczynski, Tymon, 2013. "Population Average Gender Effects," IZA Discussion Papers 7315, Institute of Labor Economics (IZA).
    15. Fitzenberger Bernd & Sommerfeld Katrin, 2016. "A Sequential Decomposition of the Drop in Collective Bargaining Coverage," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(1), pages 37-69, February.
    16. Eva Militaru & Madalina Ecaterina Popescu & Amalia Cristescu & Maria Denisa Vasilescu, 2019. "Assessing Minimum Wage Policy Implications upon Income Inequalities. The Case of Romania," Sustainability, MDPI, vol. 11(9), pages 1-20, May.
    17. Ramos, Raul & Sanromá, Esteban & Simón, Hipólito, 2022. "Collective bargaining levels, employment and wage inequality in Spain," Journal of Policy Modeling, Elsevier, vol. 44(2), pages 375-395.
    18. Pérez, Carlos & Martín-Román, Ángel & Moral, Alfonso, 2020. "Two decades of the complementary leisure effect in Spain," The Journal of the Economics of Ageing, Elsevier, vol. 15(C).
    19. Hennig, Jan-Luca & Stadler, Balazs, 2021. "Firm-specific pay premiums and the gender wage gap in 21 European countries," VfS Annual Conference 2021 (Virtual Conference): Climate Economics 242354, Verein für Socialpolitik / German Economic Association.
    20. Luis Ayala & Javier Martín‐Román & Juan Vicente, 2024. "What contributes to rising inequality in large cities?," Journal of Regional Science, Wiley Blackwell, vol. 64(5), pages 1760-1810, November.

