A merging algorithm for Gaussian mixture components
In finite mixture model clustering, each component of the fitted mixture is usually associated with a cluster. In other words, each component of the mixture is interpreted as the probability distribution of the variables of interest conditional on membership in a given cluster. The Gaussian mixture model (GMM) is very popular in this context because of its simplicity and flexibility. It may happen, however, that the components of the fitted model are not well separated. In such circumstances, the number of clusters is often overestimated, and a better clustering can be obtained by joining some subsets of the partition based on the fitted GMM. Several methods for aggregating mixture components have recently been proposed in the literature. In this work, we propose a hierarchical aggregation algorithm based on a generalisation of the definition of silhouette width that takes into account the Mahalanobis distances induced by the precision matrices of the components of the fitted GMM. The algorithm chooses the number of groups corresponding to the hierarchy level that gives the highest average silhouette width. Simulation experiments and applications to real data indicate that its performance is at least as good as that of other existing methods.
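The abstract does not state the exact form of the generalised silhouette width, so the following Python sketch only illustrates the general idea under explicit assumptions and should not be read as the authors' algorithm. It assumes that each point carries the label of its fitted component, that a(i) is the smallest Mahalanobis distance from point i to a component in its own group, that b(i) is the smallest such distance to a component in any other group, and that the hierarchy is built greedily by merging the pair of groups that maximises the average silhouette width. All function names (`mahalanobis`, `average_silhouette`, `merge_components`) are hypothetical, and at least two components are required.

```python
from itertools import combinations
from math import sqrt


def mahalanobis(x, mean, precision):
    """Distance sqrt((x - mean)^T P (x - mean)) for a precision matrix P."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    quad = sum(
        diff[r] * precision[r][c] * diff[c]
        for r in range(len(diff))
        for c in range(len(diff))
    )
    return sqrt(quad)


def average_silhouette(points, labels, means, precisions, groups):
    """Average silhouette width when clusters are groups of components.

    Assumption (not taken from the paper): a(i) is the smallest distance to a
    component in the point's own group, b(i) the smallest distance to a
    component in any other group.
    """
    group_of = {k: g for g, members in enumerate(groups) for k in members}
    n_comp = len(means)
    total = 0.0
    for x, k in zip(points, labels):
        own = group_of[k]
        dist = [mahalanobis(x, means[j], precisions[j]) for j in range(n_comp)]
        a = min(dist[j] for j in groups[own])
        b = min(dist[j] for j in range(n_comp) if group_of[j] != own)
        total += 0.0 if max(a, b) == 0 else (b - a) / max(a, b)
    return total / len(points)


def merge_components(points, labels, means, precisions):
    """Greedy hierarchy over component groups (an assumed merging rule).

    Returns (best average silhouette width, groups at that hierarchy level).
    """
    groups = [(k,) for k in range(len(means))]
    best = (average_silhouette(points, labels, means, precisions, groups), groups)
    while len(groups) > 2:  # the silhouette needs at least two clusters
        candidates = []
        for g, h in combinations(range(len(groups)), 2):
            merged = [m for i, m in enumerate(groups) if i not in (g, h)]
            merged.append(tuple(sorted(groups[g] + groups[h])))
            score = average_silhouette(points, labels, means, precisions, merged)
            candidates.append((score, merged))
        score, groups = max(candidates, key=lambda c: c[0])
        if score > best[0]:
            best = (score, groups)
    return best
```

Given three components of which two overlap heavily, a call such as `merge_components(points, labels, means, precisions)` would be expected to return a two-group partition that joins the overlapping pair. Taking precision matrices as input avoids any matrix inversion, which mirrors the abstract's phrase "Mahalanobis distances induced by the precision matrices".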
Date of creation: 2013
Contact details of provider: Postal: Cannaregio, S. Giobbe no 873, 30121 Venezia
Web page: http://www.unive.it/dip.economia
Handle: RePEc:ven:wpaper:2013:04