Two new matrix-variate distributions with application in model-based clustering

Authors

  • Tomarchio, Salvatore D.
  • Punzo, Antonio
  • Bagnato, Luca

Abstract

Two matrix-variate distributions, both elliptical heavy-tailed generalizations of the matrix-variate normal distribution, are introduced. They belong to the normal scale mixture family and are obtained by choosing, respectively, a convenient shifted exponential or uniform distribution as the mixing distribution. Moreover, each has a closed-form probability density function characterized by only one parameter beyond those of the nested matrix-variate normal, and this parameter governs the tail weight. Both distributions are then used for model-based clustering via finite mixture models. Because the resulting mixtures handle data with atypical observations better than the matrix-variate normal mixture does, they can avoid disrupting the true underlying group structure. Different EM-based algorithms are implemented for parameter estimation and tested in terms of computational time and parameter recovery. Furthermore, these mixture models are fitted to simulated and real datasets, and their fitting and clustering performance is analyzed and compared with that of other well-established competitors.
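
The normal scale mixture construction described in the abstract can be illustrated with a minimal sampling sketch. The abstract does not state the parameterization, so everything specific here is an assumption made for illustration: the mixing variable W rescales the row covariance as Sigma / w, the shifted exponential mixing variable is taken as W = 1 + Exp(theta) on (1, inf), and the uniform mixing variable is taken as W ~ Uniform(1 - theta, 1). The function names are hypothetical and are not taken from the paper or from any package.

```python
import numpy as np


def sample_matrix_normal(rng, M, Sigma, Psi):
    """Draw one p x r matrix X with vec(X) ~ N(vec(M), Psi kron Sigma)."""
    A = np.linalg.cholesky(Sigma)  # Sigma = A A^T, p x p row covariance
    B = np.linalg.cholesky(Psi)    # Psi = B B^T, r x r column covariance
    Z = rng.standard_normal(M.shape)
    return M + A @ Z @ B.T


def sample_scale_mixture(rng, M, Sigma, Psi, draw_w):
    """Draw W, then X | W = w from a matrix normal with row covariance Sigma / w.

    Scaling Sigma by 1 / w is an assumed convention for this sketch.
    """
    w = draw_w(rng)
    return sample_matrix_normal(rng, M, Sigma / w, Psi), w


def shifted_exponential(theta):
    """Assumed mixing law: W = 1 + Exp(rate=theta), support (1, inf)."""
    return lambda rng: 1.0 + rng.exponential(scale=1.0 / theta)


def uniform_mixing(theta):
    """Assumed mixing law: W ~ Uniform(1 - theta, 1), with 0 < theta < 1."""
    return lambda rng: rng.uniform(1.0 - theta, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = np.zeros((3, 2))
    Sigma = np.eye(3)
    Psi = np.eye(2)
    X_se, w_se = sample_scale_mixture(rng, M, Sigma, Psi, shifted_exponential(2.0))
    X_un, w_un = sample_scale_mixture(rng, M, Sigma, Psi, uniform_mixing(0.5))
    print(X_se.shape, w_se, X_un.shape, w_un)
```

Under these assumed parameterizations the single extra parameter theta controls how far W can move away from 1, and W concentrating at 1 recovers the matrix-variate normal as a limiting case.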

Suggested Citation

  • Tomarchio, Salvatore D. & Punzo, Antonio & Bagnato, Luca, 2020. "Two new matrix-variate distributions with application in model-based clustering," Computational Statistics & Data Analysis, Elsevier, vol. 152(C).
  • Handle: RePEc:eee:csdana:v:152:y:2020:i:c:s0167947320301419
    DOI: 10.1016/j.csda.2020.107050

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947320301419
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Lawrence Hubert & Phipps Arabie, 1985. "Comparing partitions," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 193-218, December.
    2. Utkarsh J. Dang & Ryan P. Browne & Paul D. McNicholas, 2015. "Mixtures of multivariate power exponential distributions," Biometrics, The International Biometric Society, vol. 71(4), pages 1081-1089, December.
    3. Volodymyr Melnykov & Xuwen Zhu, 2019. "Studying crime trends in the USA over the years 2000–2012," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 325-341, March.
    4. Biernacki, Christophe & Celeux, Gilles & Govaert, Gerard, 2003. "Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 41(3-4), pages 561-575, January.
    5. Xiao‐Li Meng & David Van Dyk, 1997. "The EM Algorithm—an Old Folk‐song Sung to a Fast New Tune," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 59(3), pages 511-567.
    6. Sarkar, Shuchismita & Zhu, Xuwen & Melnykov, Volodymyr & Ingrassia, Salvatore, 2020. "On parsimonious models for modeling matrix data," Computational Statistics & Data Analysis, Elsevier, vol. 142(C).

    Citations

    Cited by:

    1. Xuwen Zhu & Yana Melnykov, 2022. "On Finite Mixture Modeling of Change-point Processes," Journal of Classification, Springer;The Classification Society, vol. 39(1), pages 3-22, March.
    2. Alex Sharp & Glen Chalatov & Ryan P. Browne, 2023. "A dual subspace parsimonious mixture of matrix normal distributions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(3), pages 801-822, September.
    3. Federico Ferraccioli & Giovanna Menardi, 2023. "Modal clustering of matrix-variate data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 323-345, June.
    4. Salvatore D. Tomarchio & Paul D. McNicholas & Antonio Punzo, 2021. "Matrix Normal Cluster-Weighted Models," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 556-575, October.
    5. Salvatore D. Tomarchio & Luca Bagnato & Antonio Punzo, 2022. "Model-based clustering via new parsimonious mixtures of heavy-tailed distributions," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 106(2), pages 315-347, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Utkarsh J. Dang & Michael P.B. Gallaugher & Ryan P. Browne & Paul D. McNicholas, 2023. "Model-Based Clustering and Classification Using Mixtures of Multivariate Skewed Power Exponential Distributions," Journal of Classification, Springer;The Classification Society, vol. 40(1), pages 145-167, April.
    2. Morris, Katherine & Punzo, Antonio & McNicholas, Paul D. & Browne, Ryan P., 2019. "Asymmetric clusters and outliers: Mixtures of multivariate contaminated shifted asymmetric Laplace distributions," Computational Statistics & Data Analysis, Elsevier, vol. 132(C), pages 145-166.
    3. Salvatore D. Tomarchio & Luca Bagnato & Antonio Punzo, 2022. "Model-based clustering via new parsimonious mixtures of heavy-tailed distributions," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 106(2), pages 315-347, June.
    4. Yang, Yu-Chen & Lin, Tsung-I & Castro, Luis M. & Wang, Wan-Lun, 2020. "Extending finite mixtures of t linear mixed-effects models with concomitant covariates," Computational Statistics & Data Analysis, Elsevier, vol. 148(C).
    5. Wan-Lun Wang & Tsung-I Lin, 2022. "Robust clustering of multiply censored data via mixtures of t factor analyzers," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 22-53, March.
    6. Wan-Lun Wang, 2019. "Mixture of multivariate t nonlinear mixed models for multiple longitudinal data with heterogeneity and missing values," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(1), pages 196-222, March.
    7. Zhu, Xuwen & Melnykov, Volodymyr, 2018. "Manly transformation in finite mixture modeling," Computational Statistics & Data Analysis, Elsevier, vol. 121(C), pages 190-208.
    8. Semhar Michael & Volodymyr Melnykov, 2016. "An effective strategy for initializing the EM algorithm in finite mixture models," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 10(4), pages 563-583, December.
    9. Hung Tong & Cristina Tortora, 2022. "Model-based clustering and outlier detection with missing data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(1), pages 5-30, March.
    10. Alessio Farcomeni & Antonio Punzo, 2020. "Robust model-based clustering with mild and gross outliers," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(4), pages 989-1007, December.
    11. Luca Scrucca & Adrian Raftery, 2015. "Improved initialisation of model-based clustering using Gaussian hierarchical partitions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 9(4), pages 447-460, December.
    12. Derek S. Young & Xi Chen & Dilrukshi C. Hewage & Ricardo Nilo-Poyanco, 2019. "Finite mixture-of-gamma distributions: estimation, inference, and model-based clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(4), pages 1053-1082, December.
    13. Galimberti, Giuliano & Soffritti, Gabriele, 2014. "A multivariate linear regression analysis using finite mixtures of t distributions," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 138-150.
    14. Donatella Vicari & Paolo Giordani, 2023. "CPclus: Candecomp/Parafac Clustering Model for Three-Way Data," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 432-465, July.
    15. Salvatore D. Tomarchio & Paul D. McNicholas & Antonio Punzo, 2021. "Matrix Normal Cluster-Weighted Models," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 556-575, October.
    16. Kim, Nam-Hwui & Browne, Ryan P., 2021. "In the pursuit of sparseness: A new rank-preserving penalty for a finite mixture of factor analyzers," Computational Statistics & Data Analysis, Elsevier, vol. 160(C).
    17. Morris, Katherine & McNicholas, Paul D., 2016. "Clustering, classification, discriminant analysis, and dimension reduction via generalized hyperbolic mixtures," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 133-150.
    18. Počuča, Nikola & Jevtić, Petar & McNicholas, Paul D. & Miljkovic, Tatjana, 2020. "Modeling frequency and severity of claims with the zero-inflated generalized cluster-weighted models," Insurance: Mathematics and Economics, Elsevier, vol. 94(C), pages 79-93.
    19. Hasnat, Md. Abul & Velcin, Julien & Bonnevay, Stephane & Jacques, Julien, 2017. "Evolutionary clustering for categorical data using parametric links among multinomial mixture models," Econometrics and Statistics, Elsevier, vol. 3(C), pages 141-159.
    20. Gabriele Perrone & Gabriele Soffritti, 2023. "Seemingly unrelated clusterwise linear regression for contaminated data," Statistical Papers, Springer, vol. 64(3), pages 883-921, June.
