Printed from https://ideas.repec.org/a/eee/csdana/v83y2015icp223-235.html

Mixtures of common t-factor analyzers for modeling high-dimensional data with missing values

Author

Listed:
  • Wang, Wan-Lun

Abstract

Mixtures of common t-factor analyzers (MCtFA) have emerged as a sound, parsimonious model-based tool for robust modeling of high-dimensional data in the presence of fat-tailed noise and atypical observations. This paper generalizes MCtFA to accommodate missing values, which occur frequently in scientific research. Under a missing-at-random mechanism, a computationally efficient Expectation Conditional Maximization Either (ECME) algorithm is developed for parameter estimation. Techniques for visualizing the data, classifying new individuals, and imputing missing values under an incomplete-data structure of MCtFA are also investigated. Illustrative examples involving real and simulated data sets demonstrate the usefulness of the proposed methodology and compare its finite-sample performance with that of its normal counterparts.
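
To make the imputation idea concrete, the following is a minimal sketch and not the paper's ECME algorithm. It assumes a single elliptical component (normal or t) whose location `mu` and scale matrix `sigma` are already known, and fills each missing entry with its conditional mean given the observed entries. For a multivariate t component this conditional mean has the same linear form as in the normal case. The function name `conditional_mean_impute` and the toy parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np


def conditional_mean_impute(y, mu, sigma):
    """Fill NaN entries of y with E[y_mis | y_obs].

    Assumes one elliptical (normal or t) component with known location
    `mu` and positive-definite scale matrix `sigma`. This is an
    illustrative helper, not the estimator developed in the paper.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    mis = np.isnan(y)      # mask of missing coordinates
    obs = ~mis             # mask of observed coordinates
    filled = y.copy()

    if not mis.any():
        return filled      # nothing to impute
    if not obs.any():
        return mu.copy()   # no information: fall back to the location

    sigma_oo = sigma[np.ix_(obs, obs)]   # observed-observed block
    sigma_mo = sigma[np.ix_(mis, obs)]   # missing-observed block
    resid = y[obs] - mu[obs]

    # mu_m + Sigma_mo Sigma_oo^{-1} (y_o - mu_o), solved without an explicit inverse
    filled[mis] = mu[mis] + sigma_mo @ np.linalg.solve(sigma_oo, resid)
    return filled


# Toy usage with made-up parameters: the second coordinate is missing.
example = conditional_mean_impute(
    [2.0, np.nan],
    mu=[0.0, 0.0],
    sigma=[[1.0, 0.5], [0.5, 1.0]],
)
print(example)
```

In a mixture setting one would presumably combine such component-wise conditional means using posterior component probabilities computed from the observed entries; that weighting step is omitted here.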

Suggested Citation

  • Wang, Wan-Lun, 2015. "Mixtures of common t-factor analyzers for modeling high-dimensional data with missing values," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 223-235.
  • Handle: RePEc:eee:csdana:v:83:y:2015:i:c:p:223-235
    DOI: 10.1016/j.csda.2014.10.007

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947314002990
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Cinzia Viroli, 2010. "Dimensionally Reduced Model-Based Clustering Through Mixtures of Factor Mixture Analyzers," Journal of Classification, Springer;The Classification Society, vol. 27(3), pages 363-388, November.
    2. Murray, Paula M. & Browne, Ryan P. & McNicholas, Paul D., 2014. "Mixtures of skew-t factor analyzers," Computational Statistics & Data Analysis, Elsevier, vol. 77(C), pages 326-335.
    3. Montanari, Angela & Viroli, Cinzia, 2011. "Maximum likelihood estimation of mixtures of factor analyzers," Computational Statistics & Data Analysis, Elsevier, vol. 55(9), pages 2712-2723, September.
    4. Kotz, Samuel & Nadarajah, Saralees, 2004. "Multivariate T-Distributions and Their Applications," Cambridge Books, Cambridge University Press, number 9780521826549.
    5. Boldea, Otilia & Magnus, Jan R., 2009. "Maximum Likelihood Estimation of the Multivariate Normal Mixture Model," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1539-1549.
    6. Wan-Lun Wang & Tsung-I Lin, 2013. "An efficient ECM algorithm for maximum likelihood estimation in mixtures of t-factor analyzers," Computational Statistics, Springer, vol. 28(2), pages 751-769, April.
    7. Lawrence Hubert & Phipps Arabie, 1985. "Comparing partitions," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 193-218, December.
    8. McLachlan, G. J. & Peel, D. & Bean, R. W., 2003. "Modelling high-dimensional data by mixtures of factor analyzers," Computational Statistics & Data Analysis, Elsevier, vol. 41(3-4), pages 379-388, January.
    9. Wang, Wan-Lun, 2013. "Mixtures of common factor analyzers for high-dimensional data with missing information," Journal of Multivariate Analysis, Elsevier, vol. 117(C), pages 120-133.
    10. Dankmar Böhning & Ekkehart Dietz & Rainer Schaub & Peter Schlattmann & Bruce Lindsay, 1994. "The distribution of the likelihood ratio for mixtures of densities from the one-parameter exponential family," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 46(2), pages 373-388, June.
    11. McLachlan, G.J. & Bean, R.W. & Ben-Tovim Jones, L., 2007. "Extension of the mixture of factor analyzers model to incorporate the multivariate t-distribution," Computational Statistics & Data Analysis, Elsevier, vol. 51(11), pages 5327-5338, July.

    Citations

    Cited by:

    1. Scrucca, Luca, 2016. "Identifying connected components in Gaussian finite mixture models for clustering," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 5-17.
    2. García-Escudero, Luis Angel & Gordaliza, Alfonso & Greselin, Francesca & Ingrassia, Salvatore & Mayo-Iscar, Agustín, 2016. "The joint role of trimming and constraints in robust estimation for mixtures of Gaussian factor analyzers," Computational Statistics & Data Analysis, Elsevier, vol. 99(C), pages 131-147.
    3. repec:eee:jmvana:v:161:y:2017:i:c:p:157-171 is not listed on IDEAS
    4. Lin, Tsung-I & McLachlan, Geoffrey J. & Lee, Sharon X., 2016. "Extending mixtures of factor models using the restricted multivariate skew-normal distribution," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 398-413.
    5. Wraith, Darren & Forbes, Florence, 2015. "Location and scale mixtures of Gaussians with flexible tail behaviour: Properties, inference and application to multivariate clustering," Computational Statistics & Data Analysis, Elsevier, vol. 90(C), pages 61-73.

    IDEAS is a RePEc service hosted by the Research Division of the Federal Reserve Bank of St. Louis . RePEc uses bibliographic data supplied by the respective publishers.