Printed from https://ideas.repec.org/p/ese/cempwp/cempa9-25.html

Machine learning regionalisation of input data for microsimulation models: An application of a hybrid GBM / IPF method to build a tax-benefit model for the Essex region in the UK

Author

Listed:
  • Richiardi, Matteo
  • Rejoice, Frimpong

Abstract

Developing microsimulation models often requires reweighting an input dataset to reflect the characteristics of a different population of interest. In this paper we explore a machine learning approach whereby a variant of decision trees (Gradient Boosted Machine) is used to replicate the joint distribution of target variables observed in a large, commercially available but slightly biased dataset, with an additional raking step to remove the bias and ensure that the relevant marginal distributions are consistent with official statistics. The method is applied to build a regional variant of UKMOD, an open-source static tax-benefit model for the UK belonging to the EUROMOD family, with an application to the Greater Essex region in the UK.
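
The raking step mentioned in the abstract is commonly implemented as iterative proportional fitting (IPF). The sketch below is only a generic illustration of that technique, not the authors' implementation: the function name `rake`, the toy records, the starting weights, and the target margins are all hypothetical. It assumes every target category appears in the sample and that the target totals agree across variables.

```python
from collections import defaultdict


def rake(records, weights, targets, max_iter=100, tol=1e-10):
    """Generic IPF (raking) sketch; hypothetical helper, not from the paper.

    records: list of dicts mapping variable name -> category
    weights: list of positive starting weights (e.g. model-based weights)
    targets: dict variable -> dict category -> target population total
    """
    w = list(weights)
    for _ in range(max_iter):
        max_gap = 0.0
        for var, margin in targets.items():
            # Current weighted total in each category of this variable.
            totals = defaultdict(float)
            for rec, wi in zip(records, w):
                totals[rec[var]] += wi
            # Record the largest relative gap seen before adjusting.
            for cat, target in margin.items():
                max_gap = max(max_gap, abs(totals[cat] - target) / target)
            # Rescale each weight so this margin matches its target.
            for i, rec in enumerate(records):
                cat = rec[var]
                w[i] *= margin[cat] / totals[cat]
        # Stop once a full pass found every margin already on target.
        if max_gap < tol:
            break
    return w


# Toy data (assumed for illustration only): four units, two variables.
records = [
    {"age": "young", "tenure": "own"},
    {"age": "young", "tenure": "rent"},
    {"age": "old", "tenure": "own"},
    {"age": "old", "tenure": "rent"},
]
start_weights = [1.0, 1.0, 1.0, 1.0]
targets = {
    "age": {"young": 40.0, "old": 60.0},
    "tenure": {"own": 70.0, "rent": 30.0},
}

raked = rake(records, start_weights, targets)
for rec, wi in zip(records, raked):
    print(rec, round(wi, 3))
```

Raking only rescales the starting weights, so the association between variables carried by those weights is preserved while the margins are pulled to the targets. That is why it can be paired with a model-based first stage that captures the joint distribution.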

Suggested Citation

  • Richiardi, Matteo & Rejoice, Frimpong, 2025. "Machine learning regionalisation of input data for microsimulation models: An application of a hybrid GBM / IPF method to build a tax-benefit model for the Essex region in the UK," Centre for Microsimulation and Policy Analysis Working Paper Series CEMPA9/25, Centre for Microsimulation and Policy Analysis at the Institute for Social and Economic Research.
  • Handle: RePEc:ese:cempwp:cempa9-25

    Download full text from publisher

    File URL: https://www.iser.essex.ac.uk/wp-content/uploads/files/working-papers/cempa/cempa9-25.pdf
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Victor Chernozhukov & Iván Fernández‐Val & Blaise Melly, 2013. "Inference on Counterfactual Distributions," Econometrica, Econometric Society, vol. 81(6), pages 2205-2268, November.
    2. Tymon Słoczyński, 2015. "The Oaxaca–Blinder Unexplained Component as a Treatment Effects Estimator," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 77(4), pages 588-604, August.
    3. Thomas Y. Mathä & Alessandro Porpiglia & Michael Ziegelmeyer, 2014. "Wealth differences across borders and the effect of real estate price dynamics: Evidence from two household surveys," BCL working papers 90, Central Bank of Luxembourg.
    4. Leone Leonida & Marianna Marra & Sergio Scicchitano & Antonio Giangreco & Marco Biagetti, 2020. "Estimating the Wage Premium to Supervision for Middle Managers in Different Contexts: Evidence from Germany and the UK," Work, Employment & Society, British Sociological Association, vol. 34(6), pages 1004-1026, December.
    5. Töpfer, Marina, 2017. "Detailed RIF decomposition with selection: The gender pay gap in Italy," Hohenheim Discussion Papers in Business, Economics and Social Sciences 26-2017, University of Hohenheim, Faculty of Business, Economics and Social Sciences.
    6. Sergio Longobardi & Margherita Maria Pagliuca & Andrea Regoli, 2018. "Can problem-solving attitudes explain the gender gap in financial literacy? Evidence from Italian students’ data," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(4), pages 1677-1705, July.
    7. Calvo,Paula Andrea & Lopez-Calva,Luis-Felipe & Posadas,Josefina, 2015. "A decade of declining earnings inequality in the Russian Federation," Policy Research Working Paper Series 7392, The World Bank.
    8. Sloczynski, Tymon, 2013. "Population Average Gender Effects," IZA Discussion Papers 7315, Institute of Labor Economics (IZA).
    9. Fitzenberger Bernd & Sommerfeld Katrin, 2016. "A Sequential Decomposition of the Drop in Collective Bargaining Coverage," Journal of Economics and Statistics (Jahrbuecher fuer Nationaloekonomie und Statistik), De Gruyter, vol. 236(1), pages 37-69, February.
    10. Eva Militaru & Madalina Ecaterina Popescu & Amalia Cristescu & Maria Denisa Vasilescu, 2019. "Assessing Minimum Wage Policy Implications upon Income Inequalities. The Case of Romania," Sustainability, MDPI, vol. 11(9), pages 1-20, May.
    11. Ramos, Raul & Sanromá, Esteban & Simón, Hipólito, 2022. "Collective bargaining levels, employment and wage inequality in Spain," Journal of Policy Modeling, Elsevier, vol. 44(2), pages 375-395.
    12. Pérez, Carlos & Martín-Román, Ángel & Moral, Alfonso, 2020. "Two decades of the complementary leisure effect in Spain," The Journal of the Economics of Ageing, Elsevier, vol. 15(C).
    13. Hennig, Jan-Luca & Stadler, Balazs, 2021. "Firm-specific pay premiums and the gender wage gap in 21 European countries," VfS Annual Conference 2021 (Virtual Conference): Climate Economics 242354, Verein für Socialpolitik / German Economic Association.
    14. Luis Ayala & Javier Martín‐Román & Juan Vicente, 2024. "What contributes to rising inequality in large cities?," Journal of Regional Science, Wiley Blackwell, vol. 64(5), pages 1760-1810, November.
    15. Manuel Arellano & Stéphane Bonhomme, 2017. "Quantile Selection Models With an Application to Understanding Changes in Wage Inequality," Econometrica, Econometric Society, vol. 85, pages 1-28, January.
    16. Anna Lukiyanova, 2013. "Earnings inequality and informal Employment in Russia," HSE Working papers WP BRP 37/EC/2013, National Research University Higher School of Economics.
    17. Juan Manuel del Pozo Segura, 2017. "Has the Gender Wage Gap been Reduced during the 'Peruvian Growth Miracle?' A Distributional Approach," Documentos de Trabajo / Working Papers 2017-442, Departamento de Economía - Pontificia Universidad Católica del Perú.
    18. Boris Hirsch & Philipp Lentge, 2021. "Non-Base Compensation and the Gender Pay Gap," Working Paper Series in Economics 404, University of Lüneburg, Institute of Economics.
    19. Zhu, Rong, 2016. "Wage differentials between urban residents and rural migrants in urban China during 2002–2007: A distributional analysis," China Economic Review, Elsevier, vol. 37(C), pages 2-14.
    20. Jan‐luca Hennig & Balazs Stadler, 2023. "Firm‐specific pay premiums and the gender wage gap in Europe," Post-Print hal-04171877, HAL.
