A generalisation of the aggregate association index (AAI): incorporating a linear transformation of the cells of a 2 × 2 table

Author

Listed:
  • Eric J. Beh

    (University of Wollongong
    Stellenbosch University)

  • Duy Tran

    (Data Science Media, Nielsen)

  • Irene L. Hudson

    (Royal Melbourne Institute of Technology (RMIT))

Abstract

The analysis of aggregate, or marginal, data for contingency tables is an increasingly important area of statistics, applied sciences and the social sciences. This is largely due to confidentiality issues arising from the imposition of government and corporate protection and data collection methods. The availability of only aggregate data makes it difficult to draw conclusions about the association between categorical variables at the individual level. For data analysts, this issue is of growing concern, especially for those dealing with the aggregate analysis of a single 2 × 2 table or stratified 2 × 2 tables, a problem that lies in the field of ecological inference. As an alternative to ecological inference techniques, one may consider the aggregate association index (AAI) to obtain valuable information about the magnitude and direction of the association between two categorical variables of a single 2 × 2 table or stratified 2 × 2 tables given only the marginal totals. Conventionally, the AAI has been examined by considering $p_{11}$, the proportion of the sample that lies in the (1, 1)th cell of a given 2 × 2 table. However, the AAI can be expanded for other association indices. Therefore, a new generalisation of the original AAI is given here by reformulating and expanding the index so that it incorporates any linear transformation of $p_{11}$. This study shall consider the consistency of the AAI under the transformation by examining four classic association indices, namely the independence ratio, Pearson’s ratio, standardised residual and adjusted standardised residual, although others may be incorporated into this general framework. We will show how these indices can be utilised to examine the strength and direction of association given only the marginal totals. Therefore, this work enhances our understanding of the AAI and establishes its links with common association indices.
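To make the idea of a "linear transformation of $p_{11}$" concrete, the following is a minimal sketch and not the authors' implementation. It assumes the usual textbook definitions of Pearson's ratio, the standardised residual and the adjusted standardised residual (the paper's own scaling may differ), and it omits the independence ratio because the abstract does not define it. The function names `p11_bounds`, `linear_indices` and `index_ranges` are hypothetical. Given only the two marginal proportions and the sample size, each index can be written as a + b·p11, so the bounds on p11 implied by the margins translate directly into bounds on the index.

```python
from math import sqrt

def p11_bounds(p1_row, p1_col):
    """Bounds on p11 implied by the margins alone (Frechet bounds)."""
    return max(0.0, p1_row + p1_col - 1.0), min(p1_row, p1_col)

def linear_indices(p1_row, p1_col, n):
    """Return {name: (a, b)} such that index = a + b * p11.

    Assumes textbook definitions and margins strictly between 0 and 1.
    """
    e = p1_row * p1_col                        # p11 expected under independence
    sd_std = sqrt(e / n)                       # scale of the standardised residual
    sd_adj = sqrt(e * (1 - p1_row) * (1 - p1_col) / n)  # scale of the adjusted residual
    return {
        "pearson_ratio": (0.0, 1.0 / e),                   # p11 / e
        "standardised_residual": (-e / sd_std, 1.0 / sd_std),  # (p11 - e) / sd_std
        "adjusted_residual": (-e / sd_adj, 1.0 / sd_adj),      # (p11 - e) / sd_adj
    }

def index_ranges(p1_row, p1_col, n):
    """Range of each index over all tables consistent with the margins."""
    lo, hi = p11_bounds(p1_row, p1_col)
    coefs = linear_indices(p1_row, p1_col, n)
    # Every slope b is positive, so the bounds on p11 map to ordered bounds.
    return {name: (a + b * lo, a + b * hi) for name, (a, b) in coefs.items()}

if __name__ == "__main__":
    # Illustrative margins only; not data from the paper.
    for name, (low, high) in index_ranges(0.6, 0.5, 100).items():
        print(f"{name}: [{low:.3f}, {high:.3f}]")
```

Under these definitions the square of the adjusted standardised residual equals Pearson's chi-squared statistic for a 2 × 2 table, which gives one way to see why an index built on the chi-squared statistic as a function of $p_{11}$ can be re-expressed in terms of such indices.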

Suggested Citation

  • Eric J. Beh & Duy Tran & Irene L. Hudson, 2024. "A generalisation of the aggregate association index (AAI): incorporating a linear transformation of the cells of a 2 × 2 table," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 87(5), pages 1-33, July.
  • Handle: RePEc:spr:metrik:v:87:y:2024:i:5:d:10.1007_s00184-023-00919-z
    DOI: 10.1007/s00184-023-00919-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-023-00919-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Antonio Forcina & Davide Pellegrino, 2019. "Estimation of voter transitions and the ecological fallacy," Quality & Quantity: International Journal of Methodology, Springer, vol. 53(4), pages 1859-1874, July.
    2. Roberto Colombi & Antonio Forcina, 2016. "Latent class models for ecological inference on voters transitions," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 25(4), pages 501-517, November.
    3. Salman Cheema & Eric J. Beh & Irene L. Hudson, 2024. "How Informative Is the Marginal Information in a 2 × 2 Table for Assessing the Association Between Variables? The Aggregate Informative Index," Mathematics, MDPI, vol. 12(23), pages 1-15, November.
    4. Carolina Plescia & Lorenzo De Sio, 2018. "An evaluation of the performance and suitability of R × C methods for ecological inference with known true values," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(2), pages 669-683, March.
    5. Rob Eisinga, 2009. "The beta‐binomial convolution model for 2×2 tables with missing cell counts," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 63(1), pages 24-42, February.
    6. Joan G. Staniswalis, 2008. "Incorporating Marginal Covariate Information in a Nonparametric Regression Model for a Sample of R×C Tables," Biometrics, The International Biometric Society, vol. 64(4), pages 1054-1061, December.
    7. Zax Jeffrey S., 2012. "Single Regression Estimates of Voting Choices When Turnout is Unknown," Statistics, Politics and Policy, De Gruyter, vol. 4(1), pages 1-22, October.
    8. Olga Orlanski & Günther G. Schulze, 2017. "The Determinants of Islamophobia - An Empirical Analysis of the Swiss Minaret Referendum," CESifo Working Paper Series 6741, CESifo.
    9. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    10. Irene L. Hudson & Linda Moore & Eric J. Beh & David G. Steel, 2010. "Ecological inference techniques: an empirical evaluation using data describing gender and voter turnout at New Zealand elections, 1893–1919," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 173(1), pages 185-213, January.
    11. Puig, Xavier & Ginebra, Josep, 2014. "A cluster analysis of vote transitions," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 328-344.
    12. Pablo Sandoval & Silvia Ojeda, 2023. "Estimation of electoral volatility parameters employing ecological inference methods," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(1), pages 405-426, February.
    13. D. James Greiner & Kevin M. Quinn, 2009. "R×C ecological inference: bounds, correlations, flexibility and transparency of assumptions," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 172(1), pages 67-81, January.
    14. Jon Wakefield, 2004. "Ecological inference for 2 × 2 tables (with discussion)," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 167(3), pages 385-445, July.
    15. Matt Barreto & Loren Collingwood & Sergio Garcia-Rios & Kassra AR Oskooii, 2022. "Estimating Candidate Support in Voting Rights Act Cases: Comparing Iterative EI and EI-R×C Methods," Sociological Methods & Research, vol. 51(1), pages 271-304, February.
    16. van Eck, N.J.P. & Waltman, L., 2009. "How to Normalize Co-Occurrence Data? An Analysis of Some Well-Known Similarity Measures," ERIM Report Series Research in Management ERS-2009-001-LIS, Erasmus Research Institute of Management (ERIM), ERIM is the joint research institute of the Rotterdam School of Management, Erasmus University and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
    17. Blasius, J. & Greenacre, M. & Groenen, P.J.F. & van de Velden, M., 2009. "Special issue on correspondence analysis and related methods," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3103-3106, June.
    18. de Bromhead, Alan & Fernihough, Alan & Hargaden, Enda, 2020. "Representation of the People: Franchise Extension and the “Sinn Féin Election” in Ireland, 1918," The Journal of Economic History, Cambridge University Press, vol. 80(3), pages 886-925, September.
    19. Ida Camminatiello & Antonello D’Ambra & Luigi D’Ambra, 2022. "The association in two-way ordinal contingency tables through global odds ratios," METRON, Springer;Sapienza Università di Roma, vol. 80(1), pages 9-22, April.
    20. A. Forcina & M. Gnaldi & B. Bracalente, 2012. "A revised Brown and Payne model of voting behaviour applied to the 2009 elections in Italy," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 21(1), pages 109-119, March.
