Printed from https://ideas.repec.org/p/pra/mprapa/3394.html

Correlation and regression in contingency tables. A measure of association or correlation in nominal data (contingency tables), using determinants

Author

Listed:
  • Colignatus, Thomas

Abstract

Nominal data currently lack a correlation coefficient, such as has already been defined for real data. A measure is possible using the determinant, with the useful interpretation that the determinant gives the ratio between volumes. With M an m × n contingency table and n ≤ m, the suggested measure is r = Sqrt[det[A'A]] with A = Normalized[M]. With M an n1 × n2 × ... × nk contingency matrix, we can construct a matrix of pairwise correlations R. A matrix of such pairwise correlations is called an association matrix. If that matrix is also positive semi-definite (PSD) then it is a proper correlation matrix. The overall correlation then is R = f[R], where f can be chosen to impose PSD-ness. An option is to use f[R] = Sqrt[1 - det[R]]. However, for both nominal and cardinal data the advisable choice is to take the maximal multiple correlation within R. The resulting measure of “nominal correlation” measures the distance between a main diagonal and the off-diagonal elements, and thus is a measure of strong correlation. Cramer’s V measure for pairwise correlation can be generalized in this manner too. It measures the distance between all diagonals (including cross-diagonals and subdiagonals) and statistical independence, and thus is a measure of weaker correlation. Finally, when variances are also defined, regression coefficients can be determined from the variance-covariance matrix. The volume ratio measure can be related to the regression coefficients, not of the variables, but of the categories in the contingency matrix, using the conditional probabilities given the row and column sums.
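
The formulas in the abstract can be illustrated with a small Python sketch. It is not the author's implementation. The abstract does not define Normalized[M]; the function `volume_ratio` below assumes, purely for illustration, that normalization means scaling each column of M to unit Euclidean length, which may differ from the paper's definition. `cramers_v` uses the standard chi-square definition of Cramer's V, `overall_from_det` is the option f[R] = Sqrt[1 - det[R]], and `max_multiple_correlation` uses the standard identity that the squared multiple correlation of variable i on the others equals 1 - det[R] / det[R without row and column i]. All function names are hypothetical.

```python
import math

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in a]
    n, d = len(a), 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda k: abs(a[k][i]))
        if a[p][i] == 0.0:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            d = -d
        d *= a[i][i]
        for k in range(i + 1, n):
            f = a[k][i] / a[i][i]
            for j in range(i, n):
                a[k][j] -= f * a[i][j]
    return d

def volume_ratio(m):
    """r = sqrt(det(A'A)); ASSUMPTION: A is m with unit-length columns."""
    rows, cols = len(m), len(m[0])
    # Assumes no column of m is entirely zero.
    norms = [math.sqrt(sum(m[i][j] ** 2 for i in range(rows)))
             for j in range(cols)]
    a = [[m[i][j] / norms[j] for j in range(cols)] for i in range(rows)]
    gram = [[sum(a[i][p] * a[i][q] for i in range(rows))
             for q in range(cols)] for p in range(cols)]
    return math.sqrt(max(det(gram), 0.0))

def cramers_v(m):
    """Cramer's V = sqrt(chi2 / (N * (min(rows, cols) - 1)))."""
    rows, cols = len(m), len(m[0])
    row_sums = [sum(r) for r in m]
    col_sums = [sum(m[i][j] for i in range(rows)) for j in range(cols)]
    total = sum(row_sums)
    chi2 = 0.0
    for i in range(rows):
        for j in range(cols):
            expected = row_sums[i] * col_sums[j] / total
            if expected > 0:
                chi2 += (m[i][j] - expected) ** 2 / expected
    return math.sqrt(chi2 / (total * (min(rows, cols) - 1)))

def overall_from_det(r):
    """The option f[R] = sqrt(1 - det(R)) from the abstract."""
    return math.sqrt(max(1.0 - det(r), 0.0))

def max_multiple_correlation(r):
    """Largest multiple correlation of one variable on all others in R.

    Assumes every principal minor used below is nonsingular.
    """
    n, full = len(r), det(r)
    best = 0.0
    for i in range(n):
        minor = [[r[p][q] for q in range(n) if q != i]
                 for p in range(n) if p != i]
        r2 = 1.0 - full / det(minor)
        best = max(best, math.sqrt(min(max(r2, 0.0), 1.0)))
    return best

diagonal = [[10, 0], [0, 10]]       # all mass on the main diagonal
independent = [[10, 20], [10, 20]]  # rows proportional: independence
pairwise = [[1.0, 0.6], [0.6, 1.0]]

print(volume_ratio(diagonal), volume_ratio(independent))
print(cramers_v(diagonal), cramers_v(independent))
print(overall_from_det(pairwise), max_multiple_correlation(pairwise))
```

Under the assumed normalization, a table with all mass on the main diagonal gives a volume ratio of 1, and a statistically independent table (proportional columns) gives 0. For two variables, both Sqrt[1 - det[R]] and the maximal multiple correlation reduce to the absolute pairwise correlation.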

Suggested Citation

  • Colignatus, Thomas, 2007. "Correlation and regression in contingency tables. A measure of association or correlation in nominal data (contingency tables), using determinants," MPRA Paper 3394, University Library of Munich, Germany, revised 07 Jun 2007.
  • Handle: RePEc:pra:mprapa:3394

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/3394/1/MPRA_paper_3394.pdf
    File Function: original version
    Download Restriction: no

    File URL: https://mpra.ub.uni-muenchen.de/3660/1/MPRA_paper_3660.pdf
    File Function: revised version
    Download Restriction: no

    References listed on IDEAS

    1. Colignatus, Thomas, 2007. "A measure of association (correlation) in nominal data (contingency tables), using determinants," MPRA Paper 2662, University Library of Munich, Germany, revised 10 Apr 2007.
    2. Colignatus, Thomas, 2007. "A comparison of nominal regression and logistic regression for contingency tables, including the 2 × 2 × 2 case in causality," MPRA Paper 3615, University Library of Munich, Germany, revised 19 Jun 2007.

    Citations



    Cited by:

    1. Colignatus, Thomas, 2017. "Comparing votes and seats with a diagonal (dis-) proportionality measure, using the slope-diagonal deviation (SDD) with cosine, sine and sign," MPRA Paper 80965, University Library of Munich, Germany, revised 24 Aug 2017.
    2. Colignatus, Thomas, 2007. "The 2 x 2 x 2 case in causality, of an effect, a cause and a confounder. A cross-over’s guide to the 2 x 2 x 2 contingency table," MPRA Paper 3351, University Library of Munich, Germany, revised 14 May 2007.
    3. Colignatus, Thomas, 2017. "Comparing votes and seats with a diagonal (dis-) proportionality measure, using the slope-diagonal deviation (SDD) with cosine, sine and sign," MPRA Paper 80833, University Library of Munich, Germany, revised 17 Aug 2017.
    4. Colignatus, Thomas, 2018. "An overview of the elementary statistics of correlation, R-squared, cosine, sine, and regression through the origin, with application to votes and seats for Parliament," MPRA Paper 84722, University Library of Munich, Germany, revised 20 Feb 2018.

    More about this item

    Keywords

    association; correlation; contingency table; volume ratio; determinant; nonparametric methods; nominal data; nominal scale; categorical data; Fisher’s exact test; odds ratio; tetrachoric correlation coefficient; phi; Cramer’s V; Pearson; contingency coefficient; uncertainty coefficient; Theil’s U; eta; meta-analysis; Simpson’s paradox; causality; statistical independence; regression;

    JEL classification:

    • C10 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - General

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

