Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution
This paper examines the selectivity of missing income data caused by item-non-response in large panel surveys, and imputation as one strategy for coping with it. Unlike in cross-section surveys, the imputation of missing values in panel data can draw on longitudinal information available for the very same observation units at other points in time. The “row-and-column imputation procedure” developed by Little & Su (1989) uses longitudinal as well as cross-sectional information in the imputation process. This procedure is applied to the German Socio-Economic Panel study (SOEP) when deriving annual income variables, complemented by purely cross-sectional techniques. Based on the SOEP, our empirical work starts with a description of the overall incidence of imputation and of its relevance, measured by imputed income as a percentage share of the total income mass: for example, while 21 % of all observations have at least one missing component of their pre-tax post-transfer income, 9 % of the overall income mass is imputed. However, this picture varies considerably for more recent sub-samples of the panel survey. Second, we analyze the impact of imputation on the personal distribution of income as well as on measures of income mobility. Comparing income inequality measures based only on truly observed information with those derived from all (i.e., observed and imputed) observations, we find that imputation increases measured inequality; this effect is relevant in both tails of the distribution, although somewhat more prominent among higher incomes. Longitudinal analyses show a positive correlation of item-non-response on income data over time, and also provide evidence that item-non-response predicts subsequent unit-non-response. Across various income mobility indicators, a robust picture emerges: income mobility is understated when only truly observed information is used. Finally, multivariate models show that survey-related factors (number of interviews, interview mode) as well as indicators of variability in income receipt (due to increased complexity of household structure and income composition) are significantly correlated with item-non-response. In conclusion, our empirical results based on the German SOEP indicate that item-non-response on income questions in social surveys is selective, and they underline the need for adequate imputation.
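To make the idea of combining longitudinal and cross-sectional information concrete, the following is a minimal sketch of one common formulation of the Little & Su row-and-column approach. It is not the SOEP production procedure. The function name `row_and_column_impute`, the toy income panel, and the choice to compute wave effects from complete cases only are assumptions made for illustration, and the sketch assumes strictly positive incomes among donors.

```python
import numpy as np


def row_and_column_impute(y):
    """Impute missing cells in a units-by-waves income array (np.nan = missing).

    Hypothetical helper illustrating a row-and-column scheme:
    imputation = recipient row effect * wave effect * donor residual.
    """
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(y)
    complete = observed.all(axis=1)
    if not complete.any():
        raise ValueError("at least one complete case is needed as a donor")

    # Column (wave) effects: each wave mean relative to the mean of all
    # wave means, computed here from complete cases only (an assumption).
    wave_means = y[complete].mean(axis=0)
    col = wave_means / wave_means.mean()

    # Row (unit) effects: average of a unit's observed incomes after
    # dividing out the wave effect.
    n_obs = observed.sum(axis=1)
    scaled = np.where(observed, y / col, 0.0)
    has_obs = n_obs > 0
    row = np.full(y.shape[0], np.nan)
    row[has_obs] = scaled[has_obs].sum(axis=1) / n_obs[has_obs]

    donors = np.flatnonzero(complete)
    out = y.copy()
    # Units with no observed wave at all are left missing.
    for i in np.flatnonzero(has_obs & ~complete):
        # Donor = complete case whose row effect is closest to the recipient's.
        j = donors[np.argmin(np.abs(row[donors] - row[i]))]
        miss = ~observed[i]
        # row[i] * col[t] * (y[j, t] / (row[j] * col[t])) simplifies to this:
        out[i, miss] = y[j, miss] * row[i] / row[j]
    return out


# Toy panel (made-up numbers): three units, three waves, one missing cell.
panel = np.array([
    [100.0, 110.0, 121.0],
    [200.0, 220.0, 242.0],
    [50.0, np.nan, 60.5],
])
imputed = row_and_column_impute(panel)
```

In this toy panel the third unit earns half of what the first unit earns in every observed wave, so the first unit is chosen as donor and the missing wave is filled with a value close to 55, half of the donor's 110. Observed cells are never altered.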
Date of creation: 2003
Provider web page: http://www.diw.de/en
Handle: RePEc:diw:diwwpp:dp376