Printed from https://ideas.repec.org/a/eee/ecosta/v18y2021icp79-88.html

A Likelihood Ratio Test of a Homoscedastic Multivariate Normal Mixture Against a Heteroscedastic Multivariate Normal Mixture

Author

Listed:
  • Cong, Lin
  • Yao, Weixin

Abstract

The multivariate finite normal mixture model is one of the most commonly used tools for analyzing heterogeneous data. When using this model, one is usually interested in whether a homoscedastic mixture model can be used to simplify it. The likelihood ratio test (LRT) is the most popular statistical tool for choosing between two nested models. Under the null model of a homoscedastic multivariate normal mixture, the asymptotic χ² distribution is commonly used to approximate the null distribution of the LRT statistic. However, numerical studies demonstrate that the χ² approximation is not satisfactory and fails to control the nominal type I error unless the sample size is larger than 2000, the mixture components are well separated, and singular solutions are avoided. A parametric bootstrap method is further proposed to approximate the distribution of the LRT statistic, and its effectiveness is evaluated through extensive numerical studies.
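As an illustration of the parametric bootstrap idea described in the abstract, the following is a minimal sketch and not the authors' implementation. All names (lrt_degrees_of_freedom, bootstrap_lrt_pvalue, fit_null, fit_alt, simulate) are hypothetical. The sketch assumes the reader supplies three callables: one that fits the homoscedastic (null) mixture, one that fits the heteroscedastic (alternative) mixture, and one that simulates data from fitted null parameters. The degrees-of-freedom helper counts the extra covariance parameters for g components in d dimensions, (g - 1) d (d + 1) / 2, which is the usual χ² reference for this pair of nested models. The "+1" correction in the p-value is a common convention and is an assumption here.

```python
import random
from typing import Any, Callable, Sequence, Tuple


def lrt_degrees_of_freedom(n_components: int, dim: int) -> int:
    """Extra free covariance parameters of the heteroscedastic model.

    Each covariance matrix has dim*(dim+1)/2 free entries; the
    heteroscedastic model has one per component, the homoscedastic
    model shares a single one.
    """
    per_covariance = dim * (dim + 1) // 2
    return (n_components - 1) * per_covariance


def bootstrap_lrt_pvalue(
    data: Sequence[Any],
    fit_null: Callable[[Sequence[Any]], Tuple[Any, float]],
    fit_alt: Callable[[Sequence[Any]], Tuple[Any, float]],
    simulate: Callable[[Any, int, random.Random], Sequence[Any]],
    n_boot: int = 199,
    seed: int = 0,
) -> Tuple[float, float]:
    """Parametric bootstrap p-value for a likelihood ratio test.

    fit_null / fit_alt return (fitted_parameters, maximized_log_likelihood).
    simulate(null_parameters, n, rng) draws n observations from the
    fitted null model. All three are user-supplied (hypothetical) hooks.
    """
    rng = random.Random(seed)

    # LRT statistic on the observed data.
    null_params, loglik_null = fit_null(data)
    _, loglik_alt = fit_alt(data)
    observed = 2.0 * (loglik_alt - loglik_null)

    # Re-fit both models on samples drawn from the fitted null model.
    exceed = 0
    for _ in range(n_boot):
        sample = simulate(null_params, len(data), rng)
        _, b_null = fit_null(sample)
        _, b_alt = fit_alt(sample)
        if 2.0 * (b_alt - b_null) >= observed:
            exceed += 1

    # Share of bootstrap statistics at least as large as the observed one.
    p_value = (exceed + 1) / (n_boot + 1)
    return observed, p_value
```

In practice the two fitting hooks would wrap an EM routine for Gaussian mixtures with a shared covariance matrix (null) and with component-specific covariance matrices (alternative). Because the abstract notes that singular solutions must be avoided, the alternative fit would likely need some safeguard, such as constraints or multiple starting values; that choice is left open here.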

Suggested Citation

  • Cong, Lin & Yao, Weixin, 2021. "A Likelihood Ratio Test of a Homoscedastic Multivariate Normal Mixture Against a Heteroscedastic Multivariate Normal Mixture," Econometrics and Statistics, Elsevier, vol. 18(C), pages 79-88.
  • Handle: RePEc:eee:ecosta:v:18:y:2021:i:c:p:79-88
    DOI: 10.1016/j.ecosta.2021.01.002

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S2452306221000046
    Download Restriction: Full text for ScienceDirect subscribers only. Contains open access articles



