
Imputation Strategies for Rightcensored Wages in Longitudinal Datasets

Author

Listed:
  • Jorg Drechsler
  • Johannes Ludsteck

Abstract

Censoring from above is a common problem with wage information, as reported wages are typically top-coded for confidentiality reasons. In administrative databases, the information is often collected only up to a pre-specified threshold, for example, the contribution limit for the social security system. While directly accounting for the censoring is possible for some analyses, the most flexible solution is to impute the values above the censoring point. This strategy offers the advantage that future users of the data no longer need to implement possibly complicated censoring estimators. However, standard cross-sectional imputation routines that rely on the classical Tobit model to impute right-censored data run a high risk of introducing bias from uncongeniality (Meng, 1994), because the analyses that will later be conducted on the imputed data are unknown to the imputer. Furthermore, as we show using a large-scale administrative database from the German Federal Employment Agency, the classical Tobit model offers a poor fit to the data. In this paper, we present some strategies to address these problems. Specifically, we use leave-one-out means as suggested by Card et al. (2013) to avoid biases from uncongeniality and rely on quantile regression or left censoring to improve the model fit. We illustrate the benefits of these modeling adjustments using the German Structure of Earnings Survey, which is (almost) unaffected by censoring and can thus serve as a testbed to evaluate the imputation procedures.
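
The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the leave-one-out mean for a worker-year is the mean of that worker's log wages in all other years, and that a censored log wage is imputed by drawing from a normal distribution truncated below at the censoring point, with the mean and standard deviation taken as given (for example, from a Tobit-type fit that is not shown). The function names and all numbers are hypothetical.

```python
import random
from statistics import NormalDist


def leave_one_out_means(values):
    """For each observation, return the mean of all OTHER observations.

    Assumption for illustration: `values` holds one worker's log wages
    across years, so each result excludes the current year.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(values)
    return [(total - v) / (n - 1) for v in values]


def draw_above_threshold(mu, sigma, threshold, rng):
    """Draw from N(mu, sigma^2) restricted to values above `threshold`.

    Uses the inverse-CDF method: sample a uniform on the part of the
    probability scale that lies above the censoring point, then map it
    back through the normal quantile function.
    """
    std = NormalDist()
    p_low = std.cdf((threshold - mu) / sigma)
    u = rng.uniform(p_low, 1.0)
    # Keep u strictly inside (0, 1) so inv_cdf is defined.
    u = min(max(u, p_low + 1e-12), 1.0 - 1e-12)
    # Guard against floating-point rounding just below the threshold.
    return max(threshold, mu + sigma * std.inv_cdf(u))


# Hypothetical example: one worker's log wages in three years.
loo = leave_one_out_means([4.0, 4.2, 4.4])

# Hypothetical example: impute one censored log wage.
rng = random.Random(0)
imputed = draw_above_threshold(mu=4.5, sigma=0.3, threshold=4.8, rng=rng)
```

In a multiple-imputation setting the draw would be repeated several times per censored observation, and the model parameters would themselves be drawn rather than fixed; both steps are omitted here for brevity.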

Suggested Citation

  • Jorg Drechsler & Johannes Ludsteck, 2025. "Imputation Strategies for Rightcensored Wages in Longitudinal Datasets," Papers 2502.12967, arXiv.org.
  • Handle: RePEc:arx:papers:2502.12967

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2502.12967
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Christian Dustmann & Albrecht Glitz & Uta Schönberg & Herbert Brücker, 2016. "Referral-based Job Search Networks," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 83(2), pages 514-546.
    2. David Card & Jörg Heining & Patrick Kline, 2013. "Workplace Heterogeneity and the Rise of West German Wage Inequality," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 128(3), pages 967-1015.
    3. Timm Bönke & Giacomo Corneo & Holger Lüthen, 2015. "Lifetime Earnings Inequality in Germany," Journal of Labor Economics, University of Chicago Press, vol. 33(1), pages 171-208.
    4. Christian Dustmann & Uta Schönberg, 2012. "Expansions in Maternity Leave Coverage and Children's Long-Term Outcomes," American Economic Journal: Applied Economics, American Economic Association, vol. 4(3), pages 190-224, July.
    5. Koenker, Roger W & Bassett, Gilbert, Jr, 1978. "Regression Quantiles," Econometrica, Econometric Society, vol. 46(1), pages 33-50, January.
    6. Fichtenbaum, Rudy & Shahidi, Hushang, 1988. "Truncation Bias and the Measurement of Income Inequality," Journal of Business & Economic Statistics, American Statistical Association, vol. 6(3), pages 335-337, July.
    7. repec:iab:iabfme:200502(en is not listed on IDEAS
    8. Göbel, Christian & Zwick, Thomas, 2013. "Are personnel measures effective in increasing productivity of old workers?," Labour Economics, Elsevier, vol. 22(C), pages 80-93.
    9. Brücker, Herbert & Hauptmann, Andreas & Jahn, Elke J. & Upward, Richard, 2014. "Migration and imperfect labor markets: Theory and cross-country evidence from Denmark, Germany and the UK," European Economic Review, Elsevier, vol. 66(C), pages 205-225.
    10. Steven Haider & Gary Solon, 2006. "Life-Cycle Variation in the Association between Current and Lifetime Earnings," American Economic Review, American Economic Association, vol. 96(4), pages 1308-1320, September.
    11. Khan, Shakeeb & Powell, James L., 2001. "Two-step estimation of semiparametric censored regression models," Journal of Econometrics, Elsevier, vol. 103(1-2), pages 73-110, July.
    12. Uta Schönberg & Johannes Ludsteck, 2014. "Expansions in Maternity Leave Coverage and Mothers' Labor Market Outcomes after Childbirth," Journal of Labor Economics, University of Chicago Press, vol. 32(3), pages 469-505.
    13. Ham, John C & Rea, Samuel A, Jr, 1987. "Unemployment Insurance and Male Unemployment Duration in Canada," Journal of Labor Economics, University of Chicago Press, vol. 5(3), pages 325-353, July.
    14. Powell, James L., 1986. "Censored regression quantiles," Journal of Econometrics, Elsevier, vol. 32(1), pages 143-155, June.
    15. Johannes F. Schmieder & Till von Wachter & Stefan Bender, 2012. "The Effects of Extended Unemployment Insurance Over the Business Cycle: Evidence from Regression Discontinuity Estimates Over 20 Years," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 127(2), pages 701-752.
    16. John M. Abowd & Francis Kramarz & David N. Margolis, 1999. "High Wage Workers and High Wage Firms," Econometrica, Econometric Society, vol. 67(2), pages 251-334, March.
    17. Philip Armour & Richard V. Burkhauser & Jeff Larrimore, 2016. "Using The Pareto Distribution To Improve Estimates Of Topcoded Earnings," Economic Inquiry, Western Economic Association International, vol. 54(2), pages 1263-1273, April.
    18. Christian Dustmann & Johannes Ludsteck & Uta Schönberg, 2009. "Revisiting the German Wage Structure," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 124(2), pages 843-881.
    19. Johannes F. Schmieder & Till von Wachter & Stefan Bender, 2011. "The Effects Of Extended Unemployment Insurance Over The Business Cycle: Evidence From Regression Discontinuity Estimates Over Twenty Years," Boston University - Department of Economics - Working Papers Series WP2011-063, Boston University - Department of Economics.
    20. Gartner, Hermann & Rässler, Susanne, 2005. "Analyzing the changing gender wage gap based on multiply imputed right censored wages," IAB-Discussion Paper 200505, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
    21. Moshe Buchinsky & Jinyong Hahn, 1998. "An Alternative Estimator for the Censored Quantile Regression Model," Econometrica, Econometric Society, vol. 66(3), pages 653-672, May.
    22. Stephen P. Jenkins & Richard V. Burkhauser & Shuaizhang Feng & Jeff Larrimore, 2011. "Measuring inequality using censored data: a multiple‐imputation approach to estimation and inference," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 174(1), pages 63-81, January.
    23. Johannes F. Schmieder & Till von Wachter & Stefan Bender, 2016. "The Effect of Unemployment Benefits and Nonemployment Durations on Wages," American Economic Review, American Economic Association, vol. 106(3), pages 739-777, March.
    24. repec:iab:iabfda:202302(de is not listed on IDEAS
    25. Thomas Lemieux, 2006. "Increasing Residual Wage Inequality: Composition Effects, Noisy Data, or Rising Demand for Skill?," American Economic Review, American Economic Association, vol. 96(3), pages 461-498, June.
    26. Büttner, Thomas & Rässler, Susanne, 2008. "Multiple imputation of right-censored wages in the German IAB Employment Sample considering heteroscedasticity," IAB-Discussion Paper 200844, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
    27. Johannes Ludsteck, 2014. "The Impact of Segregation and Sorting on the Gender Wage Gap: Evidence from German Linked Longitudinal Employer-Employee Data," ILR Review, Cornell University, ILR School, vol. 67(2), pages 362-394, April.
    28. Uwe Jensen & Hermann Gartner & Susanne Rässler, 2010. "Estimating German overqualification with stochastic earnings frontiers," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 94(1), pages 33-51, March.
    29. Hong H. & Chernozhukov V., 2002. "Three-Step Censored Quantile Regression and Extramarital Affairs," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 872-882, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Pollmann & Thomas Dohmen & Franz Palm, 2020. "Robust Estimation of Wage Dispersion with Censored Data: An Application to Occupational Earnings Risk and Risk Attitudes," De Economist, Springer, vol. 168(4), pages 519-540, December.
    2. Chernozhukov, Victor & Fernández-Val, Iván & Kowalski, Amanda E., 2015. "Quantile regression with censoring and endogeneity," Journal of Econometrics, Elsevier, vol. 186(1), pages 201-221.
    3. Lin, Guixian & He, Xuming & Portnoy, Stephen, 2012. "Quantile regression with doubly censored data," Computational Statistics & Data Analysis, Elsevier, vol. 56(4), pages 797-812.
    4. Daniel Pollmann & Thomas Dohmen & Franz Palm, 2020. "Dispersion estimation; Earnings risk; Censoring; Quantile regression; Occupational choice; Sorting; Risk preferences; SOEP; IABS," ECONtribute Discussion Papers Series 028, University of Bonn and University of Cologne, Germany.
    5. Chen, Songnian & Wang, Qian, 2023. "Quantile regression with censoring and sample selection," Journal of Econometrics, Elsevier, vol. 234(1), pages 205-226.
    6. Fan, Yanqin & Liu, Ruixuan, 2018. "Partial identification and inference in censored quantile regression," Journal of Econometrics, Elsevier, vol. 206(1), pages 1-38.
    7. Li, Tong & Oka, Tatsushi, 2015. "Set identification of the censored quantile regression model for short panels with fixed effects," Journal of Econometrics, Elsevier, vol. 188(2), pages 363-377.
    8. Chen, Songnian, 2018. "Sequential estimation of censored quantile regression models," Journal of Econometrics, Elsevier, vol. 207(1), pages 30-52.
    9. Schmillen, Achim & Umkehrer, Matthias, 2013. "The scars of youth : effects of early-career unemployment on future unemployment experience," IAB-Discussion Paper 201306, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany].
    10. Victor Chernozhukov & Iván Fernández‐Val & Blaise Melly, 2013. "Inference on Counterfactual Distributions," Econometrica, Econometric Society, vol. 81(6), pages 2205-2268, November.
    11. Dirk Antonczyk & Thomas DeLeire & Bernd Fitzenberger, 2018. "Polarization and Rising Wage Inequality: Comparing the U.S. and Germany," Econometrics, MDPI, vol. 6(2), pages 1-33, April.
    12. Sandner, Malte, 2019. "Effects of early childhood intervention on fertility and maternal employment: Evidence from a randomized controlled trial," Journal of Health Economics, Elsevier, vol. 63(C), pages 159-181.
    13. Ji, Yonggang & Lin, Nan & Zhang, Baoxue, 2012. "Model selection in binary and tobit quantile regression using the Gibbs sampler," Computational Statistics & Data Analysis, Elsevier, vol. 56(4), pages 827-839.
    14. P. Čížek & S. Sadikoglu, 2018. "Bias-corrected quantile regression estimation of censored regression models," Statistical Papers, Springer, vol. 59(1), pages 215-247, March.
    15. Goldschmidt, Deborah & Klosterhuber, Wolfram & Schmieder, Johannes F., 2017. "Identifying couples in administrative data," Journal for Labour Market Research, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nürnberg [Institute for Employment Research, Nuremberg, Germany], vol. 50(1), pages 29-43.
    16. Christian Schluter & Mark Trede, 2024. "Spatial earnings inequality," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 22(3), pages 531-550, September.
    17. Bernd Fitzenberger & Ralf Wilke, 2006. "Using quantile regression for duration analysis," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 90(1), pages 105-120, March.
    18. Ludsteck, Johannes & Haupt, Harald, 2007. "An Empirical Test of the Reder Hypothesis," Discussion Papers in Economics 1397, University of Munich, Department of Economics.
    19. J.-M. Daussin-Benichou & A. Mauroux, 2014. "Turning the heat up. How sensitive are households to fiscal incentives on energy efficiency investments?," Documents de Travail de l'Insee - INSEE Working Papers g2014-06, Institut National de la Statistique et des Etudes Economiques.
