
TWICE: Tree-based Wage Inference with Clustering and Estimation

Authors

  • Aslan Bakirov
  • Francesco Del Prato
  • Paolo Zacchia

Abstract

How much do worker skills, firm pay policies, and their interaction contribute to wage inequality? Standard approaches rely on latent fixed effects identified through worker mobility, but sparse networks inflate variance estimates, additivity assumptions rule out complementarities, and the resulting decompositions lack interpretability. We propose TWICE (Tree-based Wage Inference with Clustering and Estimation), a framework that models the conditional wage function directly from observables using gradient-boosted trees, replacing latent effects with interpretable, observable-anchored partitions. This trades off the ability to capture idiosyncratic unobservables for robustness to sampling noise and out-of-sample portability. Applied to Portuguese administrative data, TWICE outperforms linear benchmarks out of sample and reveals that sorting and non-additive interactions explain substantially more wage dispersion than implied by standard AKM estimates.
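For orientation, the standard additive benchmark that the abstract contrasts against can be sketched as a variance decomposition. The snippet below is a minimal illustration on simulated data, not the authors' method or code: it assumes a log wage y = a + p + e, where the worker component a, firm component p, residual e, and all parameter values are hypothetical names and numbers chosen for the example. Under additivity, Var(y) splits exactly into a worker term, a firm term, a sorting term 2 Cov(a, p), and residual terms. A non-additive interaction of the kind TWICE is designed to capture has no separate term in this decomposition.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance; cov(xs, xs) is the population variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def decompose(a, p, e):
    """Split Var(a + p + e) into worker, firm, sorting and residual terms."""
    return {
        "worker": cov(a, a),
        "firm": cov(p, p),
        "sorting": 2 * cov(a, p),
        # Residual variance plus its covariances with the two components.
        "residual": cov(e, e) + 2 * cov(a, e) + 2 * cov(p, e),
    }


# Simulated data (illustrative assumption): firm pay rises with worker
# skill, so the sorting term is positive by construction.
rng = random.Random(0)
n = 5000
a = [rng.gauss(0, 1) for _ in range(n)]
p = [0.5 * ai + rng.gauss(0, 0.5) for ai in a]
e = [rng.gauss(0, 0.3) for _ in range(n)]
y = [ai + pi + ei for ai, pi, ei in zip(a, p, e)]

parts = decompose(a, p, e)
total = cov(y, y)
shares = {name: value / total for name, value in parts.items()}
```

Here `shares` gives each term's fraction of total wage variance, and the fractions sum to one because the decomposition is an exact identity under additivity.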

Suggested Citation

  • Aslan Bakirov & Francesco Del Prato & Paolo Zacchia, 2026. "TWICE: Tree-based Wage Inference with Clustering and Estimation," Papers 2601.00776, arXiv.org.
  • Handle: RePEc:arx:papers:2601.00776

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2601.00776
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Erling Barth & Alex Bryson & James C. Davis & Richard Freeman, 2016. "It's Where You Work: Increases in the Dispersion of Earnings across Establishments and Individuals in the United States," Journal of Labor Economics, University of Chicago Press, vol. 34(S2), pages 67-97.
    2. Harold D. Chiang & Kengo Kato & Yukun Ma & Yuya Sasaki, 2022. "Multiway Cluster Robust Double/Debiased Machine Learning," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(3), pages 1046-1056, June.
    3. David Card & Jörg Heining & Patrick Kline, 2013. "Workplace Heterogeneity and the Rise of West German Wage Inequality," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 128(3), pages 967-1015.
    4. George Baker & Michael Gibbs & Bengt Holmstrom, 1994. "The Internal Economics of the Firm: Evidence from Personnel Data," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 109(4), pages 881-919.
    5. John M. Abowd & Robert H. Creecy & Francis Kramarz, 2002. "Computing Person and Firm Effects Using Linked Longitudinal Employer-Employee Data," Longitudinal Employer-Household Dynamics Technical Papers 2002-06, Center for Economic Studies, U.S. Census Bureau.
    6. M. J. Andrews & L. Gill & T. Schank & R. Upward, 2008. "High wage workers and low wage firms: negative assortative matching or limited mobility bias?," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 171(3), pages 673-697, June.
    7. Jae Song & David J Price & Fatih Guvenen & Nicholas Bloom & Till von Wachter, 2019. "Firming Up Inequality," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 134(1), pages 1-50.
    8. Stéphane Bonhomme & Kerstin Holzheu & Thibaut Lamadon & Elena Manresa & Magne Mogstad & Bradley Setzler, 2023. "How Much Should We Trust Estimates of Firm Effects and Worker Sorting?," Journal of Labor Economics, University of Chicago Press, vol. 41(2), pages 291-322.
    9. Carolina Nunes & Bruno P. Carvalho & João Pereira dos Santos & Susana Peralta & José Tavares, 2023. "Failing Young and Temporary Workers? The Impact of a Disruptive Crisis on a Dual Labour Market," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 23(2), pages 349-395, April.
    10. John M. Abowd & Francis Kramarz & David N. Margolis, 1999. "High Wage Workers and High Wage Firms," Econometrica, Econometric Society, vol. 67(2), pages 251-334, March.
    11. Stéphane Bonhomme & Thibaut Lamadon & Elena Manresa, 2019. "A Distributional Framework for Matched Employer Employee Data," Econometrica, Econometric Society, vol. 87(3), pages 699-739, May.
    12. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    13. Daniel W. Apley & Jingyu Zhu, 2020. "Visualizing the effects of predictor variables in black box supervised learning models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(4), pages 1059-1086, September.
    14. David Card & Ana Rute Cardoso & Joerg Heining & Patrick Kline, 2018. "Firms and Labor Market Inequality: Evidence and Some Theory," Journal of Labor Economics, University of Chicago Press, vol. 36(S1), pages 13-70.
    15. Patrick Kline & Raffaele Saggio & Mikkel Sølvsten, 2020. "Leave‐Out Estimation of Variance Components," Econometrica, Econometric Society, vol. 88(5), pages 1859-1898, September.
    16. Edward P. Lazear, 2018. "Compensation and Incentives in the Workplace," Journal of Economic Perspectives, American Economic Association, vol. 32(3), pages 195-214, Summer.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bertay, Ata Can & Carreño, José & Huizinga, Harry & Uras, Burak & Vellekoop, Nathanael, 2022. "Technological change and the finance wage premium," SAFE Working Paper Series 361, Leibniz Institute for Financial Research SAFE.
    2. Kline, Patrick, 2024. "Firm wage effects," Handbook of Labor Economics, Elsevier.
    3. Bonhomme, Stéphane & Denis, Angela, 2024. "Estimating heterogeneous effects: Applications to labor economics," Labour Economics, Elsevier, vol. 91(C).
    4. Olivier Godechot & Marco G Palladino & Damien Babet, 2023. "In the Land of AKM: Explaining the Dynamics of Wage Inequality in France," Working Papers hal-04319406, HAL.
    5. Jason Sockin, 2022. "Show Me the Amenity: Are Higher-Paying Firms Better All Around?," CESifo Working Paper Series 9842, CESifo.
    6. Engbom, Niklas & Moser, Christian & Sauermann, Jan, 2023. "Firm pay dynamics," Journal of Econometrics, Elsevier, vol. 233(2), pages 396-423.
    7. Jaime Arellano-Bover & Fernando Saltiel, 2026. "Differences in On-the-Job Learning across Firms," Journal of Labor Economics, University of Chicago Press, vol. 44(1), pages 149-188.
    8. Walters, Christopher, 2024. "Empirical Bayes methods in labor economics," Handbook of Labor Economics, Elsevier.
    9. Jose Garcia-Louzao & Alessandro Ruggieri, 2023. "Labor Market Competition and Inequality," Bank of Lithuania Working Paper Series 117, Bank of Lithuania.
    10. Fanfani, Bernardo, 2022. "Tastes for discrimination in monopsonistic labour markets," Labour Economics, Elsevier, vol. 75(C).
    11. Roesch, Marcus & Gerritse, Michiel & Karreman, Bas & van Oort, Frank & Loog, Bart, 2025. "Do workers or firms drive the foreign acquisition wage gap?," European Economic Review, Elsevier, vol. 178(C).
    12. Eliason, Marcus & Hensvik, Lena & Kramarz, Francis & Skans, Oskar Nordström, 2023. "Social connections and the sorting of workers to firms," Journal of Econometrics, Elsevier, vol. 233(2), pages 468-506.
    13. Bonacini, Luca & Patriarca, Fabrizio & Santoni, Edoardo, 2025. "Background wage premia, beyond education: Firm sorting and unobserved abilities of graduates," Economics of Education Review, Elsevier, vol. 109(C).
    14. Benjamin Lochner & Bastian Schulz, 2024. "Firm Productivity, Wages, and Sorting," Journal of Labor Economics, University of Chicago Press, vol. 42(1), pages 85-119.
    15. Labanca, Claudio & Pozzoli, Dario, 2022. "Hours Constraints and Wage Differentials across Firms," IZA Discussion Papers 14992, IZA Network @ LISER.
    16. Huitfeldt, Ingrid & Kostøl, Andreas R. & Nimczik, Jan & Weber, Andrea, 2023. "Internal labor markets: A worker flow approach," Journal of Econometrics, Elsevier, vol. 233(2), pages 661-688.
    17. Arellano-Bover, Jaime & San, Shmuel, 2023. "The Role of Firms and Job Mobility in the Assimilation of Immigrants: Former Soviet Union Jews in Israel 1990–2019," IZA Discussion Papers 16389, IZA Network @ LISER.
    18. Annaïg Morin, 2023. "Workplace heterogeneity and wage inequality in Denmark," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(1), pages 123-133, January.
    19. Rasmus Lentz & Suphanit Piyapromdee & Jean-Marc Robin, 2022. "The Anatomy of Sorting - Evidence from Danish Data," Working Papers hal-03869383, HAL.
    20. Lachowska, Marta & Mas, Alexandre & Saggio, Raffaele & Woodbury, Stephen A., 2023. "Do firm effects drift? Evidence from Washington administrative data," Journal of Econometrics, Elsevier, vol. 233(2), pages 375-395.

