Multiple correspondence analysis and the multilogit bilinear model

Author

Listed:
  • Fithian, William
  • Josse, Julie

Abstract

Multiple correspondence analysis is a dimension reduction technique which plays a large role in the analysis of tables with categorical nominal variables, such as survey data. Though it is usually motivated and derived using geometric considerations, we prove that in fact, it can be seen as a single proximal Newton step of a natural bilinear exponential family model for categorical data: the multinomial logit bilinear model. We compare and contrast the behavior of multiple correspondence analysis with that of this model on simulated data, and discuss new insights into both approaches and their cognate models. Consequently, multiple correspondence analysis can be used to approximate the parameters of the multilogit model. Indeed, estimating the model’s parameters is non-trivial, whereas multiple correspondence analysis has the advantage of being easily solved by a singular value decomposition, and scalable to large data sets. We illustrate the methods on a survey of the drinking habits in France in the context of European policies against the harmful effects of alcohol on society.
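
The abstract notes that multiple correspondence analysis reduces to a singular value decomposition. As a rough illustration, the sketch below follows the standard textbook computation: one-hot encode the categorical variables into an indicator matrix, form the standardized residuals, and take their SVD. It is not the authors' code and does not reproduce the proximal Newton derivation in the paper. The function names `indicator_matrix` and `mca`, the integer coding of categories, and the simulated data are all assumptions made for the example.

```python
import numpy as np


def indicator_matrix(data):
    """One-hot encode an (n, m) array of integer-coded categorical variables."""
    blocks = []
    for j in range(data.shape[1]):
        levels = np.unique(data[:, j])  # only levels actually observed
        blocks.append((data[:, [j]] == levels[None, :]).astype(float))
    return np.hstack(blocks)


def mca(data, n_components=2):
    """Textbook MCA: SVD of the standardized residuals of the indicator matrix.

    Returns row principal coordinates, column (category) principal
    coordinates, and the principal inertias (squared singular values).
    """
    Z = indicator_matrix(data)
    n, m = data.shape
    P = Z / (n * m)                 # correspondence matrix, entries sum to 1
    r = P.sum(axis=1)               # row masses (each equal to 1/n)
    c = P.sum(axis=0)               # column masses (category frequencies / m)
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    rows = (U[:, :k] * s[:k]) / np.sqrt(r)[:, None]
    cols = (Vt[:k].T * s[:k]) / np.sqrt(c)[:, None]
    return rows, cols, s ** 2


# Hypothetical data: 200 respondents, 4 questions with 3 answer levels each.
rng = np.random.default_rng(0)
data = rng.integers(0, 3, size=(200, 4))
rows, cols, inertia = mca(data)
```

With this scaling, the total inertia equals J/m - 1, where J is the number of observed categories and m the number of variables, which gives a quick sanity check on the decomposition.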

Suggested Citation

  • Fithian, William & Josse, Julie, 2017. "Multiple correspondence analysis and the multilogit bilinear model," Journal of Multivariate Analysis, Elsevier, vol. 157(C), pages 87-102.
  • Handle: RePEc:eee:jmvana:v:157:y:2017:i:c:p:87-102
    DOI: 10.1016/j.jmva.2017.02.009

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X1730115X
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations

    Cited by:

    1. Robin, Geneviève & Josse, Julie & Moulines, Éric & Sardy, Sylvain, 2019. "Low-rank model with covariates for count data with missing values," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 416-434.
    2. Xiaozi Liu & Henrik Lindhjem & Kristine Grimsrud & Einar Leknes & Endre Tvinnereim, 2023. "Is there a generational shift in preferences for forest carbon sequestration vs. preservation of agricultural landscapes?," Climatic Change, Springer, vol. 176(9), pages 1-22, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wang, Fa, 2017. "Maximum likelihood estimation and inference for high dimensional nonlinear factor models with application to factor-augmented regressions," MPRA Paper 93484, University Library of Munich, Germany, revised 19 May 2019.
    2. Wang, Fa, 2022. "Maximum likelihood estimation and inference for high dimensional generalized factor models with application to factor-augmented regressions," Journal of Econometrics, Elsevier, vol. 229(1), pages 180-200.
    3. Giuseppe Bove & Akinori Okada, 2018. "Methods for the analysis of asymmetric pairwise relationships," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(1), pages 5-31, March.
    4. Robin, Geneviève & Josse, Julie & Moulines, Éric & Sardy, Sylvain, 2019. "Low-rank model with covariates for count data with missing values," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 416-434.
    5. John C. Gower & Sugnet Gardner-Lubbe & Niel J. Le Roux, 2018. "Interaction: Fisher’s Optimal Scores Revisited," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(1), pages 92-112, March.
    6. Valero-Mora, Pedro M. & Young, Forrest W. & Friendly, Michael, 2003. "Visualizing categorical data in ViSta," Computational Statistics & Data Analysis, Elsevier, vol. 43(4), pages 495-508, August.
    7. Mariela González-Narváez & María José Fernández-Gómez & Susana Mendes & José-Luis Molina & Omar Ruiz-Barzola & Purificación Galindo-Villardón, 2021. "Study of Temporal Variations in Species–Environment Association through an Innovative Multivariate Method: MixSTATICO," Sustainability, MDPI, vol. 13(11), pages 1-25, May.
    8. Rosaria Lombardo & Yoshio Takane & Eric J. Beh, 2020. "Familywise decompositions of Pearson’s chi-square statistic in the analysis of contingency tables," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(3), pages 629-649, September.
    9. Domenico Piccolo & Rosaria Simone, 2019. "The class of cub models: statistical foundations, inferential issues and empirical evidence," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 28(3), pages 389-435, September.
    10. Henk Kiers, 1995. "Maximization of sums of quotients of quadratic forms and some generalizations," Psychometrika, Springer;The Psychometric Society, vol. 60(2), pages 221-245, June.
    11. Husson, François & Josse, Julie & Saporta, Gilbert, 2016. "Jan de Leeuw and the French School of Data Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 73(i06).
    12. Zhiqiu Hu & Rong-Cai Yang, 2013. "A New Distribution-Free Approach to Constructing the Confidence Region for Multiple Parameters," PLOS ONE, Public Library of Science, vol. 8(12), pages 1-13, December.
    13. Yanyuan Ma & Marc G. Genton, 2010. "Explicit estimating equations for semiparametric generalized linear latent variable models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(4), pages 475-495, September.
    14. Emilio Augusto Coelho-Barros & Jorge Alberto Achcar & Josmar Mazucheli, 2010. "Longitudinal Poisson modeling: an application for CD4 counting in HIV-infected patients," Journal of Applied Statistics, Taylor & Francis Journals, vol. 37(5), pages 865-880.
    15. David B. Dunson & Sally D. Perreault, 2001. "Factor Analytic Models of Clustered Multivariate Data with Informative Censoring," Biometrics, The International Biometric Society, vol. 57(1), pages 302-308, March.
    16. Cai, Jing-Heng & Song, Xin-Yuan & Lam, Kwok-Hap & Ip, Edward Hak-Sing, 2011. "A mixture of generalized latent variable models for mixed mode and heterogeneous data," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 2889-2907, November.
    17. Johané Nienkemper-Swanepoel & Michael J Maltitz, 2017. "Investigating the Performance of a Variation of Multiple Correspondence Analysis for Multiple Imputation in Categorical Data Sets," Journal of Classification, Springer;The Classification Society, vol. 34(3), pages 384-398, October.
    18. Gardner, Sugnet & Gower, John C. & le Roux, N.J., 2006. "A synthesis of canonical variate analysis, generalised canonical correlation and Procrustes analysis," Computational Statistics & Data Analysis, Elsevier, vol. 50(1), pages 107-134, January.
    19. Vitoratou, Silia & Ntzoufras, Ioannis & Moustaki, Irini, 2016. "Explaining the behavior of joint and marginal Monte Carlo estimators in latent variable models with independence assumptions," LSE Research Online Documents on Economics 57685, London School of Economics and Political Science, LSE Library.
    20. Tsonaka, R. & Moustaki, I., 2007. "Parameter constraints in generalized linear latent variable models," Computational Statistics & Data Analysis, Elsevier, vol. 51(9), pages 4164-4177, May.
