Imputation Methods for Handling Item-Nonresponse in the Social Sciences: A Methodological Review
Abstract
Missing data are often a problem in social science data. Imputation methods fill in the missing responses and lead, under certain conditions, to valid inference. This article reviews several imputation methods used in the social sciences and discusses the advantages and disadvantages of these methods in practice. Simpler imputation methods as well as more advanced methods, such as fractional and multiple imputation, are considered. The paper introduces readers who are new to the imputation literature to key ideas and methods. For those already familiar with imputation methods, the paper highlights some new developments and clarifies some recent misconceptions in the use of imputation methods. The emphasis is on efficient hot deck imputation methods, implemented in either multiple or fractional imputation approaches. Software packages for using imputation methods in practice are reviewed, highlighting newer developments. The paper discusses an example from the social sciences in detail, applying several imputation methods to a missing earnings variable. The objective is to illustrate how to choose between methods in a real data example. A simulation study evaluates various imputation methods, including predictive mean matching, fractional imputation, and multiple imputation. Certain forms of fractional and multiple hot deck methods are found to perform well with regard to the bias and efficiency of a point estimator and robustness against model misspecification. Standard parametric imputation methods are not found to be adequate for the application considered. [NCRM WP]
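As a rough illustration of the kind of multiple hot deck imputation the abstract refers to, the following Python sketch fills in a missing earnings variable by drawing donors within imputation classes, using an approximate Bayesian bootstrap step before each draw. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `abb_hot_deck` and `mi_mean`, the education-based imputation classes, and the earnings figures are all hypothetical, and only the point estimate (not its variance) is combined across imputations.

```python
import random
from statistics import mean


def abb_hot_deck(values, classes, rng):
    """Return one completed copy of `values` (missing entries are None).

    Hot deck imputation within classes with an approximate Bayesian
    bootstrap step. Assumes every class has at least one observed donor;
    a class without donors raises KeyError.
    """
    # Collect the observed values (potential donors) in each class.
    donors = {}
    for value, cls in zip(values, classes):
        if value is not None:
            donors.setdefault(cls, []).append(value)

    # Step 1: resample each donor pool with replacement (bootstrap step).
    pools = {cls: rng.choices(obs, k=len(obs)) for cls, obs in donors.items()}

    # Step 2: each missing value receives a random donor value drawn
    # from the resampled pool of its own class.
    return [
        value if value is not None else rng.choice(pools[cls])
        for value, cls in zip(values, classes)
    ]


def mi_mean(values, classes, m=5, seed=0):
    """Average the completed-data means over m imputed data sets."""
    rng = random.Random(seed)
    estimates = [mean(abb_hot_deck(values, classes, rng)) for _ in range(m)]
    return mean(estimates)


# Hypothetical data: earnings with two missing responses.
earnings = [21000, None, 25000, 52000, None, 48000, 50000]
education = ["low", "low", "low", "high", "high", "high", "high"]

completed = abb_hot_deck(earnings, education, random.Random(1))
estimate = mi_mean(earnings, education, m=20, seed=1)
print(completed)
print(round(estimate, 1))
```

Because donors come from observed respondents in the same class, every imputed value is a value that actually occurred in the data, which is the property that makes hot deck methods less dependent on a parametric model for earnings.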
Bibliographic Info
Paper provided by eSocialSciences in its series Working Papers with number id:2007.
Date of creation: Jun 2009
Note: Institutional Papers
Contact details of provider:
Web page: http://www.esocialsciences.org
Keywords: item-nonresponse; imputation; fractional imputation; multiple imputation; estimation of distribution functions
References:
- Schenker, Nathaniel & Taylor, Jeremy M. G., 1996. "Partially parametric techniques for multiple imputation," Computational Statistics & Data Analysis, Elsevier, vol. 22(4), pages 425-446, August.
- Hirsch, Barry & Schumacher, Edward J., 2003. "Match Bias in Wage Gap Estimates Due to Earnings Imputation," IZA Discussion Papers 783, Institute for the Study of Labor (IZA).
- Barry T. Hirsch & Edward J. Schumacher, 2004. "Match Bias in Wage Gap Estimates Due to Earnings Imputation," Journal of Labor Economics, University of Chicago Press, vol. 22(3), pages 689-722, July.
- Lee Lillard & James P. Smith & Finis Welch, 2004. "What Do We Really Know About Wages: The Importance of Nonreporting and Census Imputation," Labor and Demography
- Lillard, Lee & Smith, James P & Welch, Finis, 1986. "What Do We Really Know about Wages? The Importance of Nonreporting and Census Imputation," Journal of Political Economy, University of Chicago Press, vol. 94(3), pages 489-506, June.
- Horton N. J. & Lipsitz S. R., 2001. "Multiple Imputation in Practice: Comparison of Software Packages for Regression Models With Missing Variables," The American Statistician, American Statistical Association, vol. 55, pages 244-254, August.
- J. K. Kim, 2002. "A note on approximate Bayesian bootstrap imputation," Biometrika, Biometrika Trust, vol. 89(2), pages 470-477, June.
- Jae Kwang Kim, 2004. "Fractional hot deck imputation," Biometrika, Biometrika Trust, vol. 91(3), pages 559-578, September.
- Patrick Royston, 2004. "Multiple imputation of missing values," Stata Journal, StataCorp LP, vol. 4(3), pages 227-241, September.
- John B. Carlin & Ning Li & Philip Greenwood & Carolyn Coffey, 2003. "Tools for analyzing multiple imputed datasets," Stata Journal, StataCorp LP, vol. 3(3), pages 226-244, September.