Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution


  • Joachim R. Frick
  • Markus M. Grabka


This paper examines the selectivity of missing income data caused by item-non-response in large panel surveys, and imputation as one strategy for dealing with it. In contrast to cross-sectional surveys, imputation in panel data can draw on longitudinal information available for the very same observation units at other points in time. The “row-and-column imputation procedure” developed by Little & Su (1989) uses both longitudinal and cross-sectional information in the imputation process. This procedure is applied to the German Socio-Economic Panel study (SOEP) when deriving annual income variables, complemented by purely cross-sectional techniques. Based on the SOEP, our empirical work starts with a description of the overall incidence of imputation and of its relevance, measured as the imputed share of the total income mass: for example, while 21 % of all observations have at least one missing component of their pre-tax post-transfer income, 9 % of the overall income mass is imputed. However, this picture varies considerably for the more recent sub-samples of the panel survey. Second, we analyze the impact of imputation on the personal distribution of income and on measures of income mobility. Comparing income inequality measures based only on truly observed information with those derived from all (i.e., observed and imputed) observations, we find that imputation increases measured inequality; the effect is relevant in both tails of the distribution, although somewhat more pronounced among higher incomes. Longitudinal analyses show a positive correlation of item-non-response on income data over time, and also provide evidence that item-non-response predicts subsequent unit-non-response.
Across various income mobility indicators, a robust picture emerges: income mobility is understated when only truly observed information is used. Finally, multivariate models show that survey-related factors (number of interviews, interview mode) as well as indicators of variability in income receipt (due to increased complexity of household structure and income composition) are significantly correlated with item-non-response. In conclusion, our empirical results based on the German SOEP indicate that item-non-response on income questions in social surveys is selective and underline the need for adequate imputation.
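
The abstract names the row-and-column procedure but does not spell it out. The following is a minimal sketch of the general idea as it is commonly described: each missing cell is filled with the product of a row (unit) effect, a column (wave) effect, and the residual of the nearest complete case. It is not the SOEP implementation, which according to the abstract is complemented by cross-sectional techniques. The function name `row_and_column_impute`, the toy panel, and the choice to estimate wave effects from complete cases only are assumptions made for illustration, and strictly positive incomes are assumed.

```python
from typing import List, Optional

Panel = List[List[Optional[float]]]  # rows = units, columns = waves, None = missing


def row_and_column_impute(panel: Panel) -> Panel:
    """Fill missing cells as row effect x column effect x donor residual."""
    n_waves = len(panel[0])
    complete = [row for row in panel if all(v is not None for v in row)]
    if not complete:
        raise ValueError("at least one complete case is needed as a donor")

    # Column (wave) effects: wave means over complete cases, scaled to average 1.
    wave_means = [sum(row[t] for row in complete) / len(complete) for t in range(n_waves)]
    grand_mean = sum(wave_means) / n_waves
    col = [m / grand_mean for m in wave_means]

    def row_effect(row):
        # Row (unit) effect: mean of the observed values after removing wave effects.
        scaled = [v / col[t] for t, v in enumerate(row) if v is not None]
        return sum(scaled) / len(scaled) if scaled else None

    donors = [(row_effect(row), row) for row in complete]

    filled = []
    for row in panel:
        r = row_effect(row)
        if r is None:
            # No observed wave at all: this sketch leaves the row untouched.
            filled.append(list(row))
            continue
        # Donor: the complete case whose row effect is closest to this unit's.
        r_donor, donor = min(donors, key=lambda d: abs(d[0] - r))
        new_row = []
        for t, v in enumerate(row):
            if v is None:
                residual = donor[t] / (r_donor * col[t])
                v = r * col[t] * residual
            new_row.append(v)
        filled.append(new_row)
    return filled


# Hypothetical toy panel: three units, three waves, one missing income.
panel = [
    [100.0, 110.0, 121.0],
    [200.0, 220.0, 242.0],
    [50.0, None, 60.5],
]
filled = row_and_column_impute(panel)
print(round(filled[2][1], 2))  # 55.0
```

In the toy panel the third unit earns half as much as its donor (the first unit) in the waves where both are observed, so the imputed value is half the donor's wave-2 income. Because the donor's residual is carried over, the imputed values keep some of the variability that a pure mean-based fill would remove.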

Suggested Citation

  • Joachim R. Frick & Markus M. Grabka, 2003. "Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution," Discussion Papers of DIW Berlin 376, DIW Berlin, German Institute for Economic Research.
  • Handle: RePEc:diw:diwwpp:dp376

    References listed on IDEAS

    1. Regina Riphahn & Oliver Serfling, 2005. "Item non-response on income and wealth questions," Empirical Economics, Springer, vol. 30(2), pages 521-538, September.
    2. Jürgen Schupp & Gert G. Wagner, 2002. "Maintenance of and Innovation in Long-Term Panel Studies: The Case of the German Socio-Economic Panel (GSOEP)," Discussion Papers of DIW Berlin 276, DIW Berlin, German Institute for Economic Research.
    3. Daniel H. Hill & Robert J. Willis, 2001. "Reducing Panel Attrition: A Search for Effective Policy Instruments," Journal of Human Resources, University of Wisconsin Press, vol. 36(3), pages 416-438.
    4. Gary S. Fields & Efe A. Ok, 1999. "Measuring Movement of Incomes," Economica, London School of Economics and Political Science, vol. 66(264), pages 455-471, November.

    Cited by:

    1. Katja Landau & Stephan Klasen & Walter Zucchini, 2012. "Measuring Vulnerability to Poverty Using Long-Term Panel Data," Courant Research Centre: Poverty, Equity and Growth - Discussion Papers 118, Courant Research Centre PEG.
    2. Viktor Steiner & Peter Haan & Katharina Wrohlich, 2005. "Dokumentation des Steuer-Transfer-Mikrosimulationsmodells STSM 1999 - 2002," Data Documentation 9, DIW Berlin, German Institute for Economic Research.
    3. Pirmin Fessler & Peter Mooslechner & Martin Schürz & Karin Wagner, 2009. "Housing Wealth of Austrian Households," Monetary Policy & the Economy, Oesterreichische Nationalbank (Austrian Central Bank), issue 2, pages 104-124.
    4. Susanne Rässler & Regina Riphahn, 2006. "Survey item nonresponse and its treatment," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 90(1), pages 217-232, March.

    More about this item


    Keywords: Item-Non-Response; Imputation; Income Inequality

    JEL classification:

    • C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
    • D31 - Microeconomics - - Distribution - - - Personal Income and Wealth Distribution
    • I32 - Health, Education, and Welfare - - Welfare, Well-Being, and Poverty - - - Measurement and Analysis of Poverty
