
Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis

Author

Listed:
  • Yuan, Ke-Hai

Abstract

When missing data are either missing completely at random (MCAR) or missing at random (MAR), the maximum likelihood (ML) estimation procedure preserves many of its properties. However, in any statistical modeling, the distribution specification for the likelihood function is at best only an approximation to the real world. In particular, since the normal-distribution-based ML is typically applied to data with heterogeneous marginal skewness and kurtosis, it is necessary to know whether such a practice still generates consistent parameter estimates. When the manifest variables are linear combinations of independent random components and missing data are MAR, this paper shows that the normal-distribution-based MLE is consistent regardless of the distribution of the sample. Examples also show that the consistency of the MLE is not guaranteed for all nonnormally distributed samples. When the population follows a confirmatory factor model, and data are missing due to the magnitude of the factors, the MLE may not be consistent even when data are normally distributed. When data are missing due to the magnitude of measurement errors/uniqueness, MLEs for many of the covariance parameters related to the missing variables are still consistent. This paper also identifies and discusses the factors that affect the asymptotic biases of the MLE when data are not missing at random. In addition, the paper also shows that, under certain data models and MAR mechanism, the MLE is asymptotically normally distributed and the asymptotic covariance matrix is consistently estimated by the commonly used sandwich-type covariance matrix. The results indicate that certain formulas and/or conclusions in the existing literature may not be entirely correct.
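The consistency claim for MAR data can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a bivariate case in which `x` is always observed and `y` is missing whenever `x` exceeds a threshold, and it uses the standard factored-likelihood closed form of the normal-theory MLE for such a monotone pattern. The function name `normal_mle_monotone`, the skewed components `z1` and `z2`, the loading 0.5, and the threshold 0 are all hypothetical choices made for illustration.

```python
import numpy as np

def normal_mle_monotone(x, y):
    """Normal-theory MLE of (mu, Sigma) for bivariate data in which x is
    complete and y may contain NaN (monotone pattern), computed through the
    factored likelihood f(x) * f(y | x)."""
    obs = ~np.isnan(y)
    # Marginal part: uses every case, since x is always observed.
    mu_x = x.mean()
    s_xx = x.var()                     # ML variance (divides by n)
    # Conditional part: regression of y on x among the complete cases.
    xo, yo = x[obs], y[obs]
    b1 = np.cov(xo, yo, bias=True)[0, 1] / xo.var()
    b0 = yo.mean() - b1 * xo.mean()
    s_res = np.mean((yo - b0 - b1 * xo) ** 2)
    # Map back to the mean vector and covariance matrix.
    mu_y = b0 + b1 * mu_x
    s_xy = b1 * s_xx
    s_yy = s_res + b1 ** 2 * s_xx
    return np.array([mu_x, mu_y]), np.array([[s_xx, s_xy], [s_xy, s_yy]])

rng = np.random.default_rng(0)
n = 200_000

# Independent, strongly skewed components (centered exponentials): assumption.
z1 = rng.exponential(1.0, n) - 1.0
z2 = rng.exponential(1.0, n) - 1.0

# Manifest variables as linear combinations of the independent components.
x = z1
y = 0.5 * z1 + z2

# MAR mechanism: y is missing whenever the always-observed x is positive.
y_mis = np.where(x > 0.0, np.nan, y)

mu_hat, sigma_hat = normal_mle_monotone(x, y_mis)
cc_mean_y = np.nanmean(y_mis)          # complete-case mean, for comparison

print("normal-theory MLE of mu:", mu_hat)
print("normal-theory MLE of Sigma:\n", sigma_hat)
print("complete-case mean of y:", cc_mean_y)
```

Under these assumptions the population mean is (0, 0) and the population covariance matrix has entries 1, 0.5 and 1.25. In this setup the normal-theory estimates should land close to those values even though the data are far from normal, while the complete-case mean of `y` is pulled well below zero because only cases with small `x` retain `y`.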

Suggested Citation

  • Yuan, Ke-Hai, 2009. "Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis," Journal of Multivariate Analysis, Elsevier, vol. 100(9), pages 1900-1918, October.
  • Handle: RePEc:eee:jmvana:v:100:y:2009:i:9:p:1900-1918

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047-259X(09)00107-9
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. C. Hendricks Brown, 1983. "Asymptotic comparison of missing data procedures for estimating factor loadings," Psychometrika, Springer;The Psychometric Society, vol. 48(2), pages 269-291, June.
    2. Geert Molenberghs & Caroline Beunckens & Cristina Sotto & Michael G. Kenward, 2008. "Every missingness not at random model has a missingness at random counterpart with equal fit," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(2), pages 371-388, April.
    3. Amemiya, Takeshi, 1973. "Regression Analysis when the Dependent Variable is Truncated Normal," Econometrica, Econometric Society, vol. 41(6), pages 997-1016, November.
    4. Xin-Yuan Song & Sik-Yum Lee, 2002. "Analysis of structural equation model with ignorable missing continuous and polytomous data," Psychometrika, Springer;The Psychometric Society, vol. 67(2), pages 261-288, June.
    5. Carl Finkbeiner, 1979. "Estimation for the multiple factor model when data are missing," Psychometrika, Springer;The Psychometric Society, vol. 44(4), pages 409-420, December.
    6. Sik-Yum Lee, 1986. "Estimation for structural equation models with missing data," Psychometrika, Springer;The Psychometric Society, vol. 51(1), pages 93-99, March.
    7. Tang, Man-Lai & Bentler, Peter M., 1998. "Theory and method for constrained estimation in structural equation models with incomplete data," Computational Statistics & Data Analysis, Elsevier, vol. 27(3), pages 257-270, May.
    8. White, Halbert, 1982. "Maximum Likelihood Estimation of Misspecified Models," Econometrica, Econometric Society, vol. 50(1), pages 1-25, January.
    9. Gourieroux, Christian & Monfort, Alain & Trognon, Alain, 1984. "Pseudo Maximum Likelihood Methods: Theory," Econometrica, Econometric Society, vol. 52(3), pages 681-700, May.
    10. Roderick J. A. Little, 1988. "Robust Estimation of the Mean and Covariance Matrix from Data with Missing Values," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 37(1), pages 23-38, March.
    11. Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
    12. Yuan, Ke-Hai & Jennrich, Robert I., 1998. "Asymptotics of Estimating Equations under Natural Conditions," Journal of Multivariate Analysis, Elsevier, vol. 65(2), pages 245-260, May.
    13. Yuan, Ke-Hai, 1997. "A Theorem on Uniform Convergence of Stochastic Functions with Applications," Journal of Multivariate Analysis, Elsevier, vol. 62(1), pages 100-109, July.
    14. Bengt Muthén & David Kaplan & Michael Hollis, 1987. "On structural equation modeling with data that are not missing completely at random," Psychometrika, Springer;The Psychometric Society, vol. 52(3), pages 431-462, September.
    15. Liu, Chuanhai, 1997. "ML Estimation of the Multivariate t Distribution and the EM Algorithm," Journal of Multivariate Analysis, Elsevier, vol. 63(2), pages 296-312, November.
    16. Kevin Kim & Peter Bentler, 2002. "Tests of homogeneity of means and covariance matrices for multivariate incomplete data," Psychometrika, Springer;The Psychometric Society, vol. 67(4), pages 609-623, December.

    Citations

    Cited by:

    1. Richard M. Golden & Steven S. Henley & Halbert White & T. Michael Kashner, 2019. "Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data," Econometrics, MDPI, vol. 7(3), pages 1-27, September.
    2. Kano, Yutaka & Takai, Keiji, 2011. "Analysis of NMAR missing data without specifying missing-data mechanisms in a linear latent variate model," Journal of Multivariate Analysis, Elsevier, vol. 102(9), pages 1241-1255, October.
    3. Dursun Aydın & Ersin Yılmaz, 2021. "Semiparametric modeling of the right-censored time-series based on different censorship solution techniques," Empirical Economics, Springer, vol. 61(4), pages 2143-2172, October.
    4. Yuan, Ke-Hai & Savalei, Victoria, 2014. "Consistency, bias and efficiency of the normal-distribution-based MLE: The role of auxiliary variables," Journal of Multivariate Analysis, Elsevier, vol. 124(C), pages 353-370.
    5. Ke-Hai Yuan & Mortaza Jamshidian & Yutaka Kano, 2018. "Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances," Psychometrika, Springer;The Psychometric Society, vol. 83(2), pages 425-442, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tang, Man-Lai & Bentler, Peter M., 1998. "Theory and method for constrained estimation in structural equation models with incomplete data," Computational Statistics & Data Analysis, Elsevier, vol. 27(3), pages 257-270, May.
    2. repec:gnv:wpaper:unige:76321 is not listed on IDEAS
    3. Patrick Gagliardini & Elisa Ossola & Olivier Scaillet, 2016. "Time‐Varying Risk Premium in Large Cross‐Sectional Equity Data Sets," Econometrica, Econometric Society, vol. 84, pages 985-1046, May.
    4. Schwiebert, Jörg & Wagner, Joachim, 2015. "A Generalized Two-Part Model for Fractional Response Variables with Excess Zeros," VfS Annual Conference 2015 (Muenster): Economic Development - Theory and Policy 113059, Verein für Socialpolitik / German Economic Association.
    5. Ke-Hai Yuan & Wai Chan & Yubin Tian, 2016. "Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 68(2), pages 329-351, April.
    6. Tang, Man-Lai & Lee, Sik-Yum, 1998. "Analysis of structural equation models with censored or truncated data via EM algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 27(1), pages 33-46, March.
    7. Song, Weixing & Zhang, Yi, 2012. "Empirical L2-distance lack-of-fit tests for Tobit regression models," Journal of Multivariate Analysis, Elsevier, vol. 111(C), pages 380-396.
    8. Ke-Hai Yuan & Linda Marshall & Peter Bentler, 2002. "A unified approach to exploratory factor analysis with missing data, nonnormal data, and in the presence of outliers," Psychometrika, Springer;The Psychometric Society, vol. 67(1), pages 95-121, March.
    9. Song, Weixing & Yao, Weixin, 2011. "A lack-of-fit test in Tobit errors-in-variables regression models," Statistics & Probability Letters, Elsevier, vol. 81(12), pages 1792-1801.
    10. Ke-Hai Yuan & Zhiyong Zhang, 2012. "Robust Structural Equation Modeling with Missing Data and Auxiliary Variables," Psychometrika, Springer;The Psychometric Society, vol. 77(4), pages 803-826, October.
    11. Koul, Hira L. & Song, Weixing & Liu, Shan, 2014. "Model checking in Tobit regression via nonparametric smoothing," Journal of Multivariate Analysis, Elsevier, vol. 125(C), pages 36-49.
    12. Richard M. Golden & Steven S. Henley & Halbert White & T. Michael Kashner, 2019. "Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data," Econometrics, MDPI, vol. 7(3), pages 1-27, September.
    13. Arvid Raknerud, 2002. "Identification, Estimation and Testing in Panel Data Models with Attrition: The Role of the Missing at Random Assumption," Discussion Papers 330, Statistics Norway, Research Department.
    14. Fernando Rios-Avila & Gustavo Canavire-Bacarreza, 2018. "Standard-error correction in two-stage optimization models: A quasi–maximum likelihood estimation approach," Stata Journal, StataCorp LP, vol. 18(1), pages 206-222, March.
    15. Broze, Laurence & Gourieroux, Christian, 1998. "Pseudo-maximum likelihood method, adjusted pseudo-maximum likelihood method and covariance estimators," Journal of Econometrics, Elsevier, vol. 85(1), pages 75-98, July.
    16. Magnus, Jan R., 2007. "The Asymptotic Variance Of The Pseudo Maximum Likelihood Estimator," Econometric Theory, Cambridge University Press, vol. 23(5), pages 1022-1032, October.
    17. Santos Silva, João M. C. & Tenreyro, Silvana & Windmeijer, Frank, 2015. "Testing Competing Models for Non-negative Data with Many Zeros," Journal of Econometric Methods, De Gruyter, vol. 4(1), pages 1-18, January.
    18. de Rassenfosse, Gaétan & Schoen, Anja & Wastyn, Annelies, 2014. "Selection bias in innovation studies: A simple test," Technological Forecasting and Social Change, Elsevier, vol. 81(C), pages 287-299.
    19. Hagmann, M. & Scaillet, O., 2007. "Local multiplicative bias correction for asymmetric kernel density estimators," Journal of Econometrics, Elsevier, vol. 141(1), pages 213-249, November.
    20. Smith, V. Kerry & Mansfield, Carol, 1998. "Buying Time: Real and Hypothetical Offers," Journal of Environmental Economics and Management, Elsevier, vol. 36(3), pages 209-224, November.
    21. Sik-Yum Lee, 2006. "Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data," Psychometrika, Springer;The Psychometric Society, vol. 71(3), pages 541-564, September.
