Imputation Methods for Handling Item-Nonresponse in the Social Sciences: A Methodological Review
Abstract
Missing data are often a problem in social science data. Imputation methods fill in the missing responses and lead, under certain conditions, to valid inference. This article reviews several imputation methods used in the social sciences and discusses their advantages and disadvantages in practice. Simpler imputation methods as well as more advanced methods, such as fractional and multiple imputation, are considered. The paper introduces readers new to the imputation literature to key ideas and methods. For those already familiar with imputation methods, the paper highlights some new developments and clarifies some recent misconceptions in the use of imputation methods. The emphasis is on efficient hot deck imputation methods, implemented in either multiple or fractional imputation approaches. Software packages for using imputation methods in practice are reviewed, highlighting newer developments. The paper discusses an example from the social sciences in detail, applying several imputation methods to a missing earnings variable. The objective is to illustrate how to choose between methods in a real data example. A simulation study evaluates various imputation methods, including predictive mean matching, fractional imputation, and multiple imputation. Certain forms of fractional and multiple hot deck methods are found to perform well with regard to bias and efficiency of a point estimator and robustness against model misspecification. Standard parametric imputation methods are not found to be adequate for the application considered. [NCRM WP]
Bibliographic Info
Paper provided by eSocialSciences in its series Working Papers with number id:2007.
Date of creation: Jun 2009
Note: Institutional Papers
Contact details of provider:
Web page: http://www.esocialsciences.org
Keywords: item-nonresponse; imputation; fractional imputation; multiple imputation; estimation of distribution functions