Tools for analyzing multiple imputed datasets


  • John B. Carlin

    (Murdoch Children's Research Institute and University of Melbourne Department of Paediatrics)

  • Ning Li

    (Murdoch Children's Research Institute and University of Melbourne Department of Paediatrics)

  • Philip Greenwood

    (Murdoch Children's Research Institute and University of Melbourne Department of Paediatrics)

  • Carolyn Coffey

    (Murdoch Children's Research Institute and University of Melbourne Department of Paediatrics)


The method of multiple imputation (MI) is used increasingly for analyzing datasets with missing observations. Two sets of tasks are required in order to implement the method: (a) generating multiple complete datasets in which missing values have been imputed by simulating from an appropriate probability distribution and (b) analyzing the multiple imputed datasets and combining complete data inferences from them to form an overall inference for parameters of interest. An increasing number of software tools are available for task (a), although this is difficult to automate, because the method of imputation should depend on the context and available covariate data. When the quantity of missing data is not great, the sensitivity of results to the imputation model may be relatively low. In this context, software tools that enable task (b) to be performed with similar ease to the analysis of a single dataset should facilitate the wider use of multiple imputation. Such tools need not only to implement techniques for inference from multiple imputed datasets but also to allow standard manipulations such as transformation and recoding of variables. In this article, we describe a set of Stata commands that we have developed for manipulating and analyzing multiple datasets. Copyright 2003 by StataCorp LP.
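The combining step in task (b) can be illustrated with a short sketch. The abstract does not name the combination rule or the Stata commands, so the following assumes the standard approach known as Rubin's rules: average the per-dataset point estimates, then add the average within-imputation variance to an inflated between-imputation variance. The function name `combine_rubin` and the input values are hypothetical and are not taken from the article.

```python
from math import sqrt
from statistics import mean, variance


def combine_rubin(estimates, variances):
    """Pool one scalar parameter across m imputed datasets (Rubin's rules).

    estimates: point estimate from each completed dataset
    variances: squared standard error from each completed dataset
    Hypothetical helper for illustration; not the article's Stata code.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching lists from at least two imputations")

    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between

    # Rubin's degrees of freedom for the t reference distribution;
    # infinite when the estimates do not vary across imputations.
    if between > 0:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    else:
        df = float("inf")

    return {
        "estimate": q_bar,
        "within": within,
        "between": between,
        "total_var": total,
        "se": sqrt(total),
        "df": df,
    }


# Made-up numbers: a coefficient estimated on three imputed datasets.
pooled = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled["estimate"], round(pooled["se"], 3), round(pooled["df"], 3))
```

In practice the same pooling is applied to each parameter of interest after fitting the identical model to every imputed dataset, which is why tools that loop an analysis over the datasets and then combine the results make task (b) about as easy as a single-dataset analysis.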

Suggested Citation

  • John B. Carlin & Ning Li & Philip Greenwood & Carolyn Coffey, 2003. "Tools for analyzing multiple imputed datasets," Stata Journal, StataCorp LP, vol. 3(3), pages 226-244, September.
  • Handle: RePEc:tsj:stataj:v:3:y:2003:i:3:p:226-244



    Cited by:

    1. Väisänen, Heini & Murphy, Michael J., 2014. "Social inequalities in teenage fertility outcomes: childbearing and abortion trends of three birth cohorts in Finland," LSE Research Online Documents on Economics 56660, London School of Economics and Political Science, LSE Library.
    2. Jue Yang & Shunsuke Managi & Masayuki Sato, 2015. "The effect of institutional quality on national wealth: an examination using multiple imputation method," Environmental Economics and Policy Studies, Springer;Society for Environmental Economics and Policy Studies - SEEPS, vol. 17(3), pages 431-453, July.
    3. John Bound & Michael F. Lovenheim & Sarah Turner, 2010. "Why Have College Completion Rates Declined? An Analysis of Changing Student Preparation and Collegiate Resources," American Economic Journal: Applied Economics, American Economic Association, vol. 2(3), pages 129-157, July.
    4. Wesley Eddings & Yulia Marchenko, 2012. "Diagnostics for multiple imputation in Stata," Stata Journal, StataCorp LP, vol. 12(3), pages 353-367, September.
    5. Patrick Royston, 2005. "Multiple imputation of missing values: update," Stata Journal, StataCorp LP, vol. 5(2), pages 188-201, June.
    6. Johannes Geyer, 2011. "The Effect of Health and Employment Risks on Precautionary Savings," SOEPpapers on Multidisciplinary Panel Data Research 408, DIW Berlin, The German Socio-Economic Panel (SOEP).
    7. Patrick Royston, 2004. "Multiple imputation of missing values," Stata Journal, StataCorp LP, vol. 4(3), pages 227-241, September.
    8. John B. Carlin & John C. Galati & Patrick Royston, 2008. "A new framework for managing and analyzing multiply imputed data in Stata," Stata Journal, StataCorp LP, vol. 8(1), pages 49-67, February.
    9. Gabriele Beissel Durrant, 2009. "Imputation Methods for Handling Item-Nonresponse in the Social Sciences: A Methodological Review," Working Papers id:2007, eSocialSciences.

