Nonparametric Identification and Estimation of Multivariate Mixtures
Abstract
We study nonparametric identifiability of finite mixture models of k-variate data with M subpopulations, in which the components of the data vector are independent conditional on belonging to a subpopulation. We provide a sufficient condition for nonparametrically identifying M subpopulations when k>=3. Our focus is on the relationship between the number of values the components of the data vector can take and the number of identifiable subpopulations. Intuition suggests that if the data vector can take many different values, then combining information from these values helps identification. Hall and Zhou (2003) show, however, that when k=2, two-component finite mixture models are not nonparametrically identifiable regardless of the number of values the data vector can take. When k>=3, a link emerges between the variation in the data vector and the number of identifiable subpopulations: the number of identifiable subpopulations increases as the data vector takes on additional (different) values. This points to the possibility of identifying many components even when k=3, if the data vector has a continuously distributed element. Our identification method is constructive and leads to an estimation strategy. The resulting estimator is not as efficient as the MLE, but it can be used as the initial value of the optimization algorithm in computing the MLE. We also provide a sufficient condition for identifying the number of nonparametrically identifiable components, and we develop a method for statistically testing and consistently estimating that number. We extend these procedures to develop a test for the number of components in binomial mixtures.
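The link between the number of values the data can take and the number of detectable subpopulations can be illustrated with a standard linear-algebra fact. This is a minimal sketch, not the paper's procedure: for two discrete variables that are independent conditional on the subpopulation, the joint probability matrix is a sum of M rank-one matrices, so its rank is at most M and also at most the number of support points. The function names and the numerical mixing weights and component distributions below are hypothetical, chosen only for illustration.

```python
from fractions import Fraction as F

def joint_matrix(weights, p_rows, q_rows):
    """Joint pmf P[a][b] = sum_m weights[m] * p_rows[m][a] * q_rows[m][b]."""
    n_a, n_b = len(p_rows[0]), len(q_rows[0])
    return [[sum(w * p[a] * q[b] for w, p, q in zip(weights, p_rows, q_rows))
             for b in range(n_b)]
            for a in range(n_a)]

def matrix_rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate the column entry from all rows below the pivot row.
        for i in range(rank + 1, len(rows)):
            factor = rows[i][col] / rows[rank][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank

# Hypothetical example 1: M = 2 subpopulations, each variable takes 3 values.
weights_two = [F(1, 2), F(1, 2)]
p_two = [[F(1, 2), F(1, 4), F(1, 4)], [F(1, 4), F(1, 4), F(1, 2)]]
q_two = [[F(3, 5), F(1, 5), F(1, 5)], [F(1, 5), F(1, 5), F(3, 5)]]
joint_three_values = joint_matrix(weights_two, p_two, q_two)
rank_three_values = matrix_rank(joint_three_values)  # bounded above by M = 2

# Hypothetical example 2: M = 3 subpopulations, but each variable is binary.
weights_three = [F(1, 3)] * 3
p_three = [[F(1, 5), F(4, 5)], [F(1, 2), F(1, 2)], [F(4, 5), F(1, 5)]]
q_three = [[F(1, 5), F(4, 5)], [F(1, 2), F(1, 2)], [F(4, 5), F(1, 5)]]
joint_binary = joint_matrix(weights_three, p_three, q_three)
rank_binary = matrix_rank(joint_binary)  # bounded above by 2 support points

print(rank_three_values, rank_binary)
```

In the second example the rank cannot exceed two even though three subpopulations generate the data, which is one way to see why additional support points can matter for the number of components that a rank-based argument can detect. Exact rational arithmetic is used here so that the rank is not affected by floating-point rounding; with sampled data the rank would instead have to be tested statistically.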
Bibliographic Info
Paper provided by Queen's University, Department of Economics in its series Working Papers with number 1153.
Length: 25 pages
Date of creation: Dec 2007
Keywords: finite mixture; binomial mixture; model selection; number of components; rank estimation
JEL classification:
- C13 - Mathematical and Quantitative Methods; Econometric and Statistical Methods and Methodology: General; Estimation: General
- C14 - Mathematical and Quantitative Methods; Econometric and Statistical Methods and Methodology: General; Semiparametric and Nonparametric Methods: General
- C51 - Mathematical and Quantitative Methods; Econometric Modeling; Model Construction and Estimation
- C52 - Mathematical and Quantitative Methods; Econometric Modeling; Model Evaluation, Validation, and Selection