The Contributions of Rare Objects in Correspondence Analysis
Abstract: Correspondence analysis, when used to visualize relationships in a table of counts (for example, abundance data in ecology), has been frequently criticized as being too sensitive to objects (for example, species) that occur with very low frequency or in very few samples. In this statistical report we show that this criticism is generally unfounded. We demonstrate this in several data sets by calculating the actual contributions of rare objects to the results of correspondence analysis and canonical correspondence analysis, both to the determination of the principal axes and to the chi-square distance. It is a fact that rare objects are often positioned as outliers in correspondence analysis maps, which gives the impression that they are highly influential, but their low weight offsets their distant positions and reduces their effect on the results. An alternative scaling of the correspondence analysis solution, the contribution biplot, is proposed as a way of mapping the results in order to avoid the problem of outlying and low-contributing rare objects.
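The trade-off between low weight and distant position can be sketched numerically. The following is a minimal illustration, not the paper's own code: it assumes plain correspondence analysis computed from the SVD of the standardized residuals, and the count table and the function name `ca_column_contributions` are made up for the example. The contribution of column j to axis k is taken as mass times squared principal coordinate, divided by the inertia of that axis.

```python
import numpy as np

def ca_column_contributions(table):
    """Simple CA of a count table (no zero row or column sums assumed).

    Returns column masses, column principal coordinates, the contribution
    of each column to each axis, and the singular values.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    V = Vt.T
    # Principal coordinates: standard coordinates scaled by singular values.
    coords = (V / np.sqrt(c)[:, None]) * sv
    # Contribution of column j to axis k: c_j * coords_jk**2 / sv_k**2,
    # which equals the squared entry of the right singular vector.
    contrib = V ** 2
    return c, coords, contrib, sv

# Hypothetical abundance table: 5 samples (rows) x 4 species (columns).
# The last species is rare: it occurs in only one sample.
counts = np.array([
    [30, 12,  5, 0],
    [22, 18,  9, 0],
    [10, 25, 14, 0],
    [ 6, 20, 28, 2],
    [ 4,  9, 35, 0],
])

masses, coords, contrib, sv = ca_column_contributions(counts)

# Chi-square distance of each column profile from the centroid.
distances = np.sqrt((coords ** 2).sum(axis=1))

for j in range(counts.shape[1]):
    print(f"species {j + 1}: mass={masses[j]:.4f}  "
          f"distance={distances[j]:.3f}  "
          f"contribution to axis 1={contrib[j, 0]:.4f}")
```

Comparing the printed mass, distance and contribution for the rare species against the common ones shows how the two factors combine; the contribution biplot mentioned in the abstract addresses the same issue through the scaling of the displayed coordinates.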
Bibliographic Info: Paper provided by Barcelona Graduate School of Economics in its series Working Papers with number 571.
Date of creation: Sep 2011
Keywords: biplot; canonical correspondence analysis; contribution; correspondence analysis; influence; outlier; scaling
Other versions of this item:
- Michael Greenacre, 2011. "The contributions of rare objects in correspondence analysis," Economics Working Papers 1278, Department of Economics and Business, Universitat Pompeu Fabra.
JEL classification:
- C19: Mathematical and Quantitative Methods; Econometric and Statistical Methods and Methodology: General; Other
- C88: Mathematical and Quantitative Methods; Data Collection and Data Estimation Methodology; Computer Programs; Other Computer Software
References:
- Michael Greenacre, 2008. "Correspondence analysis of raw data," Economics Working Papers 1112, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 2009.
- Oleg Nenadic & Michael Greenacre, . "Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca Package," Journal of Statistical Software, American Statistical Association, vol. 20(i03).
- Michael Greenacre, 2009. "Contribution biplots," Economics Working Papers 1162, Department of Economics and Business, Universitat Pompeu Fabra, revised Jan 2011.