IDEAS home Printed from https://ideas.repec.org/a/bla/istatr/v89y2021i1p186-206.html

Random Effects Misspecification Can Have Severe Consequences for Random Effects Inference in Linear Mixed Models

Author

Listed:
  • Francis K. C. Hui
  • Samuel Müller
  • Alan H. Welsh

Abstract

There has been considerable and controversial research over the past two decades into how successfully random effects misspecification in mixed models (i.e. assuming normality for the random effects when the true distribution is non‐normal) can be diagnosed and what its impacts are on estimation and inference. However, much of this research has focused on fixed effects inference in generalised linear mixed models. In this article, motivated by the increasing number of applications of mixed models where interest is in the variance components, we study the effects of random effects misspecification on random effects inference in linear mixed models, for which there is considerably less literature. Our findings are surprising and contrary to general belief: for point estimation, maximum likelihood estimation of the variance components under misspecification is consistent, although in finite samples both the bias and mean squared error can be substantial. For inference, we show through theory and simulation that under misspecification, standard likelihood ratio tests of truly non‐zero variance components can suffer from severely inflated type I errors, and confidence intervals for the variance components can exhibit considerable undercoverage. Furthermore, neither of these problems vanishes asymptotically as the number of clusters or the cluster size increases. These results have major implications for random effects inference, especially if the true random effects distribution is heavier tailed than the normal. Fortunately, simple graphical and goodness‐of‐fit measures of the random effects predictions appear to have reasonable power to detect misspecification. We apply linear mixed models to a survey of more than 4,000 high school students within 100 schools and analyse how mathematics achievement scores vary with student attributes and across different schools. The application demonstrates the sensitivity of mixed model inference to the true but unknown random effects distribution.
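
The undercoverage described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code and does not use maximum likelihood. It assumes a balanced random intercept model, swaps in the ANOVA (method-of-moments) estimator of the random intercept variance, and builds a Wald interval from the standard error that holds when the random effects are normal. The Laplace distribution, the sample sizes and all function names are illustrative assumptions.

```python
import numpy as np


def simulate_coverage(draw_random_effects, n_clusters=200, cluster_size=5,
                      sigma2_b=1.0, sigma2_e=1.0, n_reps=2000, seed=0):
    """Monte Carlo coverage of a nominal 95% normal-theory Wald interval
    for the random intercept variance sigma2_b (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    m, n = n_clusters, cluster_size
    covered = 0
    for _ in range(n_reps):
        # y_ij = b_i + e_ij, with unit-variance draws for b_i rescaled
        b = np.sqrt(sigma2_b) * draw_random_effects(rng, m)
        e = rng.normal(0.0, np.sqrt(sigma2_e), size=(m, n))
        y = b[:, None] + e

        # One-way ANOVA mean squares (between and within clusters)
        cluster_means = y.mean(axis=1)
        msb = n * cluster_means.var(ddof=1)
        msw = ((y - cluster_means[:, None]) ** 2).sum() / (m * (n - 1))

        # Method-of-moments estimate and its standard error, where the
        # standard error formula assumes normal random effects and errors
        est = (msb - msw) / n
        se = np.sqrt(2.0 / n**2 * (msb**2 / (m - 1) + msw**2 / (m * (n - 1))))
        covered += (est - 1.96 * se <= sigma2_b <= est + 1.96 * se)
    return covered / n_reps


def normal_effects(rng, size):
    # Correctly specified case: standard normal random effects
    return rng.standard_normal(size)


def laplace_effects(rng, size):
    # Heavier-tailed case: Laplace with scale 1/sqrt(2) has variance 1
    return rng.laplace(0.0, 1.0 / np.sqrt(2.0), size)


coverage_normal = simulate_coverage(normal_effects)
coverage_laplace = simulate_coverage(laplace_effects)
print(f"coverage, normal random effects:  {coverage_normal:.3f}")
print(f"coverage, Laplace random effects: {coverage_laplace:.3f}")
```

Under these assumptions the interval should cover close to its nominal level when the random effects are normal and noticeably less often when they are Laplace. Raising `n_clusters` is a simple way to check whether the gap closes as the number of clusters grows.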

Suggested Citation

  • Francis K. C. Hui & Samuel Müller & Alan H. Welsh, 2021. "Random Effects Misspecification Can Have Severe Consequences for Random Effects Inference in Linear Mixed Models," International Statistical Review, International Statistical Institute, vol. 89(1), pages 186-206, April.
  • Handle: RePEc:bla:istatr:v:89:y:2021:i:1:p:186-206
    DOI: 10.1111/insr.12378

    Download full text from publisher

    File URL: https://doi.org/10.1111/insr.12378
    Download Restriction: no


    Citations

    Cited by:

    1. Antonello Maruotti & Pierfrancesco Alaimo Di Loro, 2023. "CO2 emissions and growth: A bivariate bidimensional mean‐variance random effects model," Environmetrics, John Wiley & Sons, Ltd., vol. 34(5), August.
    2. Shuwen Hu & You-Gan Wang & Christopher Drovandi & Taoyun Cao, 2023. "Predictions of machine learning with mixed-effects in analyzing longitudinal data under model misspecification," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 681-711, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Freddy Hernández & Viviana Giampaoli, 2018. "The Impact of Misspecified Random Effect Distribution in a Weibull Regression Mixed Model," Stats, MDPI, vol. 1(1), pages 1-29, May.
    2. Leonardo Grilli & Carla Rampichini, 2015. "Specification of random effects in multilevel models: a review," Quality & Quantity: International Journal of Methodology, Springer, vol. 49(3), pages 967-976, May.
    3. Anders Skrondal & Sophia Rabe-Hesketh, 2022. "The Role of Conditional Likelihoods in Latent Variable Modeling," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 799-834, September.
    4. Philip S. Boonstra & Bhramar Mukherjee & Jeremy M. G. Taylor & Mef Nilbert & Victor Moreno & Stephen B. Gruber, 2011. "Bayesian Modeling for Genetic Anticipation in Presence of Mutational Heterogeneity: A Case Study in Lynch Syndrome," Biometrics, The International Biometric Society, vol. 67(4), pages 1627-1637, December.
    5. Charles E. McCulloch & John M. Neuhaus, 2011. "Prediction of Random Effects in Linear and Generalized Linear Models under Model Misspecification," Biometrics, The International Biometric Society, vol. 67(1), pages 270-279, March.
    6. Vock, David & Davidian, Marie & Tsiatis, Anastasios, 2014. "SNP_NLMM: A SAS Macro to Implement a Flexible Random Effects Density for Generalized Linear and Nonlinear Mixed Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 56(c02).
    7. Shun Yu & Xianzheng Huang, 2017. "Random-intercept misspecification in generalized linear mixed models for binary responses," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 26(3), pages 333-359, August.
    8. Peng Zhang & Peter X.-K. Song & Annie Qu & Tom Greene, 2008. "Efficient Estimation for Patient-Specific Rates of Disease Progression Using Nonnormal Linear Mixed Models," Biometrics, The International Biometric Society, vol. 64(1), pages 29-38, March.
    9. Fei Jiang & Sebastien Haneuse, 2017. "A Semi-parametric Transformation Frailty Model for Semi-competing Risks Survival Data," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 44(1), pages 112-129, March.
    10. Ye, Rendao & Wang, Tonghui & Gupta, Arjun K., 2014. "Distribution of matrix quadratic forms under skew-normal settings," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 229-239.
    11. Huang, Xianzheng, 2011. "Detecting random-effects model misspecification via coarsened data," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 703-714, January.
    12. Warrington, Nicole M. & Tilling, Kate & Howe, Laura D. & Paternoster, Lavinia & Pennell, Craig E. & Wu, Yan Yan & Briollais, Laurent, 2014. "Robustness of the linear mixed effects model to error distribution assumptions and the consequences for genome-wide association studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 13(5), pages 1-21, October.
    13. Daniel McNeish & Jeffrey R. Harring & Denis Dumas, 2023. "A multilevel structured latent curve model for disaggregating student and school contributions to learning," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 545-575, June.
    14. Jacqmin-Gadda, Helene & Sibillot, Solenne & Proust, Cecile & Molina, Jean-Michel & Thiebaut, Rodolphe, 2007. "Robustness of the linear mixed model to misspecified error distribution," Computational Statistics & Data Analysis, Elsevier, vol. 51(10), pages 5142-5154, June.
    15. Li, Erning & Pourahmadi, Mohsen, 2013. "An alternative REML estimation of covariance matrices in linear mixed models," Statistics & Probability Letters, Elsevier, vol. 83(4), pages 1071-1077.
    16. Liu, Li & Xiang, Liming, 2019. "Missing covariate data in generalized linear mixed models with distribution-free random effects," Computational Statistics & Data Analysis, Elsevier, vol. 134(C), pages 1-16.
    17. Tanya P. Garcia & Yanyuan Ma, 2016. "Optimal Estimator for Logistic Model with Distribution-free Random Intercept," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(1), pages 156-171, March.
    18. Shuwen Hu & You-Gan Wang & Christopher Drovandi & Taoyun Cao, 2023. "Predictions of machine learning with mixed-effects in analyzing longitudinal data under model misspecification," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 681-711, June.
    19. Jara, Alejandro & Quintana, Fernando & San Martín, Ernesto, 2008. "Linear mixed models with skew-elliptical distributions: A Bayesian approach," Computational Statistics & Data Analysis, Elsevier, vol. 52(11), pages 5033-5045, July.
    20. Reyhaneh Rikhtehgaran & Iraj Kazemi, 2013. "Semi-parametric Bayesian estimation of mixed-effects models using the multivariate skew-normal distribution," Computational Statistics, Springer, vol. 28(5), pages 2007-2027, October.
