The contributions of rare objects in correspondence analysis

Author

  • Michael Greenacre

Abstract

Correspondence analysis, when used to visualize relationships in a table of counts (for example, abundance data in ecology), has frequently been criticized as being too sensitive to objects (for example, species) that occur with very low frequency or in very few samples. In this statistical report we show that this criticism is generally unfounded. We demonstrate this in several data sets by calculating the actual contributions of rare objects to the results of correspondence analysis and canonical correspondence analysis, both to the determination of the principal axes and to the chi-square distance. Rare objects are indeed often positioned as outliers in correspondence analysis maps, which gives the impression that they are highly influential, but their low weight offsets their distant positions and reduces their effect on the results. An alternative scaling of the correspondence analysis solution, the contribution biplot, is proposed as a way of mapping the results that avoids the problem of outlying, low-contributing rare objects.
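
As a rough illustration of the offsetting effect described in the abstract, the sketch below computes column (for example, species) contributions for a small count table using the standard correspondence analysis decomposition. The function name `ca_column_contributions` and the toy table are assumptions made for this example; they are not taken from the paper or from any particular package.

```python
import numpy as np

def ca_column_contributions(counts):
    """Sketch of simple correspondence analysis for the columns of a count table.

    Hypothetical helper for illustration only. Assumes every row and
    column of `counts` has a positive total.
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                     # correspondence matrix
    r = P.sum(axis=1)                   # row masses
    c = P.sum(axis=0)                   # column masses

    # Standardized residuals: (P - r c^T) scaled by 1 / sqrt(r_i * c_j)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    # SVD: squared singular values are the principal inertias
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    V = Vt.T

    standard = V / np.sqrt(c)[:, None]  # standard coordinates
    principal = standard * sv           # principal coordinates

    return {
        "mass": c,
        "sv": sv,
        "principal": principal,
        # Share of each axis's inertia due to each column:
        # mass * principal**2 / sv**2, which simplifies to V**2
        "ctr_axes": V ** 2,
        # Share of the total inertia (chi-square distance to the centroid)
        "ctr_inertia": (S ** 2).sum(axis=0) / (S ** 2).sum(),
        # Contribution-biplot coordinates: sqrt(mass) * standard coordinate
        "contribution_coords": V,
    }

# Toy table (invented): 4 samples x 4 species, the last species is rare
counts = np.array([
    [30, 20, 10, 0],
    [20, 25, 15, 0],
    [10, 20, 30, 1],
    [ 5, 15, 35, 0],
])

result = ca_column_contributions(counts)
print("masses:               ", np.round(result["mass"], 4))
print("axis-1 principal coord:", np.round(result["principal"][:, 0], 3))
print("axis-1 contributions:  ", np.round(result["ctr_axes"][:, 0], 3))
```

The contribution of a column to an axis is its mass times its squared principal coordinate, divided by the axis's principal inertia, so a distant position can be offset by a small mass. In the contribution coordinates returned here, the squared coordinate of a column on an axis equals that contribution, which is the property the contribution biplot relies on to keep low-contributing points near the origin.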

Suggested Citation

  • Michael Greenacre, 2011. "The contributions of rare objects in correspondence analysis," Economics Working Papers 1278, Department of Economics and Business, Universitat Pompeu Fabra.
  • Handle: RePEc:upf:upfgen:1278

    Download full text from publisher

    File URL: https://econ-papers.upf.edu/papers/1278.pdf
    File Function: Whole Paper
    Download Restriction: no

    References listed on IDEAS

    1. Michael Greenacre, 2009. "Contribution biplots," Economics Working Papers 1162, Department of Economics and Business, Universitat Pompeu Fabra, revised Jan 2011.
    2. Michael Greenacre, 2008. "Correspondence analysis of raw data," Economics Working Papers 1112, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 2009.
    3. Nenadic, Oleg & Greenacre, Michael, 2007. "Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 20(i03).
    4. Eve Chiapello & A. Hurand, 2011. "Contribution," Post-Print hal-00681170, HAL.

    More about this item

    Keywords

    Biplot; canonical correspondence analysis; contribution; correspondence analysis; influence; outlier; scaling;

    JEL classification:

    • C19 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Other
    • C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software
