Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis
Abstract
When missing data are either missing completely at random (MCAR) or missing at random (MAR), the maximum likelihood (ML) estimation procedure preserves many of its properties. However, in any statistical modeling, the distribution specified for the likelihood function is at best an approximation to the real world. In particular, since normal-distribution-based ML is typically applied to data with heterogeneous marginal skewness and kurtosis, it is necessary to know whether such a practice still generates consistent parameter estimates. When the manifest variables are linear combinations of independent random components and missing data are MAR, this paper shows that the normal-distribution-based MLE is consistent regardless of the distribution of the sample. Examples also show that the consistency of the MLE is not guaranteed for all nonnormally distributed samples. When the population follows a confirmatory factor model and data are missing due to the magnitude of the factors, the MLE may not be consistent even when data are normally distributed. When data are missing due to the magnitude of measurement errors/uniquenesses, MLEs for many of the covariance parameters related to the missing variables are still consistent. This paper also identifies and discusses the factors that affect the asymptotic biases of the MLE when data are not missing at random. The paper further shows that, under certain data models and MAR mechanisms, the MLE is asymptotically normally distributed and its asymptotic covariance matrix is consistently estimated by the commonly used sandwich-type covariance matrix. The results indicate that certain formulas and/or conclusions in the existing literature may not be entirely correct.
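The consistency claim for MAR data can be illustrated with a small simulation. The following is a minimal sketch, not the paper's procedure: it assumes a bivariate case with a monotone missing pattern, where the normal-theory MLE has a well-known closed form (fit the marginal of the always-observed variable, then regress the incomplete variable on it using complete cases). The data model (`x = z1`, `y = 0.5*z1 + z2` with centered exponential components), the missingness cutoff of 0.5, and all function names are assumptions chosen for illustration.

```python
import random


def simulate(n, seed=0):
    """Skewed data built as linear combinations of independent components.

    x is always observed; y is missing whenever x > 0.5, so missingness
    depends only on an observed variable (MAR). Population values under
    this assumed model: mu_y = 0, sigma_xy = 0.5, sigma_yy = 1.25.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z1 = rng.expovariate(1.0) - 1.0  # centered exponential, variance 1
        z2 = rng.expovariate(1.0) - 1.0
        x = z1
        y = 0.5 * z1 + z2
        rows.append((x, y if x <= 0.5 else None))
    return rows


def normal_mle_monotone(rows):
    """Closed-form normal-theory MLE for a two-variable monotone pattern."""
    xs = [x for x, _ in rows]
    n = len(xs)
    mu_x = sum(xs) / n
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n

    # Regression of y on x using complete cases only.
    cc = [(x, y) for x, y in rows if y is not None]
    m = len(cc)
    mx = sum(x for x, _ in cc) / m
    my = sum(y for _, y in cc) / m
    sxx_c = sum((x - mx) ** 2 for x, _ in cc) / m
    sxy_c = sum((x - mx) * (y - my) for x, y in cc) / m
    syy_c = sum((y - my) ** 2 for _, y in cc) / m
    beta = sxy_c / sxx_c
    alpha = my - beta * mx
    resid_var = syy_c - beta * sxy_c

    # Map the regression parameters back to means and covariances.
    return {
        "mu_x": mu_x,
        "mu_y": alpha + beta * mu_x,
        "s_xx": s_xx,
        "s_xy": beta * s_xx,
        "s_yy": resid_var + beta ** 2 * s_xx,
        "cc_mean_y": my,  # naive complete-case mean, for comparison
    }


if __name__ == "__main__":
    est = normal_mle_monotone(simulate(100000, seed=1))
    for name, value in est.items():
        print(f"{name}: {value:.3f}")
```

With a large sample, the estimates of `mu_y`, `s_xy`, and `s_yy` should land near the population values despite the skewed components, while the naive complete-case mean of `y` is pulled below zero because cases with large `x` are dropped.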
Bibliographic Info
Article provided by Elsevier in its journal Journal of Multivariate Analysis.
Volume (Year): 100 (2009)
Issue (Month): 9 (October)
Contact details of provider:
Web page: http://www.elsevier.com/wps/find/journaldescription.cws_home/622892/description#description