IDEAS home Printed from https://ideas.repec.org/a/sae/somere/v41y2012i4p598-629.html

ML Versus MI for Missing Data With Violation of Distribution Conditions

Author

Listed:
  • Ke-Hai Yuan
  • Fan Yang-Wallentin
  • Peter M. Bentler

Abstract

Normal-distribution-based maximum likelihood (ML) and multiple imputation (MI) are the two major procedures for missing data analysis. This article compares the two procedures with respect to bias and efficiency of parameter estimates. It also compares formula-based standard errors (SEs) for each procedure against the corresponding empirical SEs. The results indicate that parameter estimates by MI tend to be less efficient than those by ML, and that the estimates of variance-covariance parameters by MI are also more biased. In particular, when the population for the observed variables possesses heavy tails, estimates of variance-covariance parameters by MI may contain severe bias even at relatively large sample sizes. Although ML performs much better, its parameter estimates may also contain substantial bias at smaller sample sizes. The results also indicate that, when the underlying population is close to normally distributed, SEs based on the sandwich-type covariance matrix and those based on the observed information matrix are very comparable to empirical SEs with either ML or MI. When the underlying distribution has heavier tails, SEs based on the sandwich-type covariance matrix for ML estimates are more reliable than those based on the observed information matrix. Both empirical results and analysis show that neither SEs based on the observed information matrix nor those based on the sandwich-type covariance matrix can provide consistent SEs in MI. Thus, ML is preferable to MI in practice, although parameter estimates by MI might still be consistent.
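
The contrast between information-based and sandwich-type SEs can be illustrated with a small simulation. The sketch below is not the authors' study design: it assumes complete data (no missingness, no imputation), a single variable, and the normal-theory ML estimate of a variance. Under those assumptions the information-based SE is sigma2_hat * sqrt(2/n) and the sandwich-type SE is sqrt((m4 - sigma2_hat^2)/n), where m4 is the fourth central sample moment. The function names, sample size, and the choice of a t distribution with 5 degrees of freedom as the heavy-tailed population are all illustrative assumptions.

```python
import numpy as np

def variance_se(x):
    """Normal-theory ML variance estimate with two formula-based SEs.

    Returns (sigma2_hat, info_se, sandwich_se). Complete data only;
    this is an illustrative simplification, not a missing-data estimator.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    sigma2 = np.mean(centered ** 2)   # ML estimate (divisor n)
    m4 = np.mean(centered ** 4)       # fourth central sample moment
    info_se = sigma2 * np.sqrt(2.0 / n)           # from the information matrix
    sandwich_se = np.sqrt((m4 - sigma2 ** 2) / n)  # sandwich-type
    return sigma2, info_se, sandwich_se

def simulate(sampler, n=200, reps=2000, seed=0):
    """Compare average formula-based SEs with the empirical SE over replications."""
    rng = np.random.default_rng(seed)
    results = np.array([variance_se(sampler(rng, n)) for _ in range(reps)])
    return {
        "empirical_se": results[:, 0].std(ddof=1),
        "mean_info_se": results[:, 1].mean(),
        "mean_sandwich_se": results[:, 2].mean(),
    }

# Normal population versus a heavy-tailed one (t with 5 degrees of freedom).
normal = simulate(lambda rng, n: rng.standard_normal(n))
heavy = simulate(lambda rng, n: rng.standard_t(5, size=n))

print("normal:", normal)
print("heavy: ", heavy)
```

In this simplified setting the two formula-based SEs should nearly coincide for the normal population, while for the heavy-tailed population the information-based SE understates the empirical SE and the sandwich-type SE tracks it more closely. This matches the pattern the abstract reports for ML estimates.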

Suggested Citation

  • Ke-Hai Yuan & Fan Yang-Wallentin & Peter M. Bentler, 2012. "ML Versus MI for Missing Data With Violation of Distribution Conditions," Sociological Methods & Research, vol. 41(4), pages 598-629, November.
  • Handle: RePEc:sae:somere:v:41:y:2012:i:4:p:598-629
    DOI: 10.1177/0049124112460373

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/0049124112460373
    Download Restriction: no



    Citations

    Cited by:

    1. Ke-Hai Yuan & Wai Chan & Yubin Tian, 2016. "Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 68(2), pages 329-351, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhong, Hua & Hu, Wuyang, 2015. "Farmers’ Willingness to Engage in Best Management Practices: an Application of Multiple Imputation," 2015 Annual Meeting, January 31-February 3, 2015, Atlanta, Georgia 196962, Southern Agricultural Economics Association.
    2. James Honaker & Gary King, 2010. "What to Do about Missing Values in Time‐Series Cross‐Section Data," American Journal of Political Science, John Wiley & Sons, vol. 54(2), pages 561-581, April.
    3. R Florez-Lopez, 2010. "Effects of missing data in credit risk scoring. A comparative analysis of methods to achieve robustness in the absence of sufficient data," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 61(3), pages 486-501, March.
    4. Präg, Patrick, 2018. "Nonresponse to Items on Self-Reported Delinquency. A Review and Evaluation of Missing Data Techniques," SocArXiv y9sv7, Center for Open Science.
    5. Scott Gehlbach & Konstantin Sonin & Ekaterina Zhuravskaya, 2010. "Businessman Candidates," American Journal of Political Science, John Wiley & Sons, vol. 54(3), pages 718-736, July.
    6. repec:jss:jstsof:45:i04 is not listed on IDEAS
    7. Kelly, Scott & Shipworth, Michelle & Shipworth, David & Gentry, Michael & Wright, Andrew & Pollitt, Michael & Crawford-Brown, Doug & Lomas, Kevin, 2013. "Predicting the diversity of internal temperatures from the English residential sector using panel methods," Applied Energy, Elsevier, vol. 102(C), pages 601-621.
    8. Ihle, R. & Amikuzuno, J. & von Cramon-Taubadel, S. & Zorya, S., 2010. "Grenzeffekte in der Marktintegration bei Mais in Ostafrika: Einsichten aus einem semi-parametrischen Regressionsmodell," Proceedings “Schriften der Gesellschaft für Wirtschafts- und Sozialwissenschaften des Landbaues e.V.”, German Association of Agricultural Economists (GEWISOLA), vol. 45, March.
    9. Matthew Blackwell & James Honaker & Gary King, 2017. "A Unified Approach to Measurement Error and Missing Data: Overview and Applications," Sociological Methods & Research, vol. 46(3), pages 303-341, August.
    10. Vincent Bauer & Keven Ruby & Robert Pape, 2017. "Solving the Problem of Unattributed Political Violence," Journal of Conflict Resolution, Peace Science Society (International), vol. 61(7), pages 1537-1564, August.
    11. Paul Poast, 2013. "Issue linkage and international cooperation: An empirical investigation," Conflict Management and Peace Science, Peace Science Society (International), vol. 30(3), pages 286-303, July.
    12. Cohen, Joseph N, 2010. "Neoliberalism’s relationship with economic growth in the developing world: Was it the power of the market or the resolution of financial crisis?," MPRA Paper 24527, University Library of Munich, Germany.
    13. You, Jong-Sung & Khagram, Sanjeev, 2004. "Inequality and Corruption," Working Paper Series rwp04-001, Harvard University, John F. Kennedy School of Government.
    14. Sergei Guriev & Daniel Treisman, 2020. "The Popularity of Authoritarian Leaders: A cross-national investigation," SciencePo Working papers Main hal-03878626, HAL.
    15. Louis Anthony (Tony) Cox, Jr & Douglas A. Popken, 2008. "Overcoming Confirmation Bias in Causal Attribution: A Case Study of Antibiotic Resistance Risks," Risk Analysis, John Wiley & Sons, vol. 28(5), pages 1155-1172, October.
    16. David (David Patrick) Madden, 2012. "The relationship between low birthweight and socioeconomic status in Ireland," Working Papers 201214, School of Economics, University College Dublin.
    17. Julia Cage & Yasmine Bekkouche, 2018. "The Price of a Vote: Evidence from France, 1993-2014," Working Papers hal-03393149, HAL.
    18. Bruno Versailles, 2012. "Market Integration and Border Effects in Eastern Africa," Economics Series Working Papers WPS/2012-01, University of Oxford, Department of Economics.
    19. Antonio Filippin & Luca Nunziata, 2019. "Monetary effects of inequality: lessons from the euro experiment," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 17(2), pages 99-124, June.
    20. Chen, Andrew Y. & McCoy, Jack, 2024. "Missing values handling for machine learning portfolios," Journal of Financial Economics, Elsevier, vol. 155(C).
    21. Robert Grafstein, 2009. "Antisocial Security: The Puzzle of Beggar‐Thy‐Children Policies," American Journal of Political Science, John Wiley & Sons, vol. 53(3), pages 710-725, July.
