IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i16p2015-d620250.html

Logistic Biplot by Conjugate Gradient Algorithms and Iterated SVD

Author

Listed:
  • Jose Giovany Babativa-Márquez

    (Department of Statistics, University of Salamanca, 37008 Salamanca, Spain
    Facultad de Ciencias de la Salud y del Deporte, Fundación Universitaria del Área Andina, Bogotá 1321, Colombia)

  • José Luis Vicente-Villardón

    (Department of Statistics, University of Salamanca, 37008 Salamanca, Spain)

Abstract

Multivariate binary data are increasingly frequent in practice. Although some adaptations of principal component analysis are used to reduce dimensionality for this kind of data, none of them provides a simultaneous representation of rows and columns (biplot). Recently, a technique named logistic biplot (LB) has been developed to represent the rows and columns of a binary data matrix simultaneously, but the algorithm used to fit its parameters is too computationally demanding to be useful in the presence of sparsity or when the matrix is large. We propose fitting an LB model using nonlinear conjugate gradient (CG) or majorization–minimization (MM) algorithms, and we introduce a cross-validation procedure to select the hyperparameter that represents the number of dimensions in the model. A Monte Carlo study covering several sparsity levels and different dimensions of the binary data set shows that the cross-validation procedure successfully selects the model for all algorithms studied. A comparison of running times shows that the CG algorithm is more efficient in the presence of sparsity and when the matrix is not very large, while the MM algorithm performs better when the binary matrix is balanced or large. To complement the proposed methods and give practical support, an R package called BiplotML has been written. Finally, real binary data on gene expression methylation are used to illustrate the proposed methods.
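
To make the kind of model described above more concrete, the following is a minimal sketch in Python, under stated assumptions. It assumes the usual logistic biplot form logit(p_ij) = mu_j + <a_i, b_j>, with row markers a_i and column markers b_j in k dimensions, which the abstract itself does not spell out. The fitting routine is plain gradient descent with step halving, used here only because it is short; it is neither the CG nor the MM algorithm of the paper, and the function names (fit_logistic_biplot and its helpers) are hypothetical, not part of BiplotML.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def logits(A, B, mu):
    """Assumed model: theta[i][j] = mu[j] + <A[i], B[j]>."""
    return [[mu[j] + sum(a * b for a, b in zip(A[i], B[j]))
             for j in range(len(B))] for i in range(len(A))]


def neg_log_lik(X, A, B, mu):
    """Bernoulli negative log-likelihood: sum of log(1 + e^theta) - x * theta."""
    T = logits(A, B, mu)
    return sum(max(t, 0.0) + math.log1p(math.exp(-abs(t))) - x * t
               for xr, tr in zip(X, T) for x, t in zip(xr, tr))


def fit_logistic_biplot(X, k=2, iters=200, step=0.5, seed=0):
    """Hypothetical helper: gradient descent with step halving (not CG, not MM)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    A = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n)]  # row markers
    B = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(p)]  # column markers
    mu = [0.0] * p                                                   # column intercepts
    losses = [neg_log_lik(X, A, B, mu)]
    for _ in range(iters):
        T = logits(A, B, mu)
        # Derivative of the loss with respect to theta[i][j] is p_ij - x_ij.
        R = [[sigmoid(T[i][j]) - X[i][j] for j in range(p)] for i in range(n)]
        gA = [[sum(R[i][j] * B[j][s] for j in range(p)) for s in range(k)]
              for i in range(n)]
        gB = [[sum(R[i][j] * A[i][s] for i in range(n)) for s in range(k)]
              for j in range(p)]
        gmu = [sum(R[i][j] for i in range(n)) for j in range(p)]
        while step > 1e-12:
            A2 = [[A[i][s] - step * gA[i][s] for s in range(k)] for i in range(n)]
            B2 = [[B[j][s] - step * gB[j][s] for s in range(k)] for j in range(p)]
            mu2 = [mu[j] - step * gmu[j] for j in range(p)]
            new_loss = neg_log_lik(X, A2, B2, mu2)
            if new_loss < losses[-1]:  # accept only steps that lower the loss
                A, B, mu = A2, B2, mu2
                losses.append(new_loss)
                break
            step /= 2.0                # otherwise shrink the step and retry
        else:
            break                      # no improving step found; stop early
    return A, B, mu, losses
```

Calling fit_logistic_biplot(X, k=2) on a list-of-lists binary matrix X returns the row markers A, the column markers B, the intercepts mu, and the loss history; plotting the rows of A as points and the rows of B as directions would give a rough two-dimensional biplot. Choosing k by cross-validation, as the abstract proposes, would wrap this fit in a loop over candidate values of k with held-out entries.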

Suggested Citation

  • Jose Giovany Babativa-Márquez & José Luis Vicente-Villardón, 2021. "Logistic Biplot by Conjugate Gradient Algorithms and Iterated SVD," Mathematics, MDPI, vol. 9(16), pages 1-19, August.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:16:p:2015-:d:620250

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/16/2015/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/16/2015/
    Download Restriction: no

    Citations

    Cited by:

    1. Laura Vicente-Gonzalez & Jose Luis Vicente-Villardon, 2022. "Partial Least Squares Regression for Binary Responses and Its Associated Biplot Representation," Mathematics, MDPI, vol. 10(15), pages 1-23, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Julio César Hernández-Sánchez & José Luis Vicente-Villardón, 2017. "Logistic biplot for nominal data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(2), pages 307-326, June.
    2. Jonathan L. Blitstein & David M. Murray & Peter J. Hannan & William R. Shadish, 2005. "Increasing the Degrees of Freedom in Future Group Randomized Trials," Evaluation Review, vol. 29(3), pages 268-286, June.
    3. Angel M. Morales & Patrick Tarwater & Indika Mallawaarachchi & Alok Kumar Dwivedi & Juan B. Figueroa-Casas, 2015. "Multinomial logistic regression approach for the evaluation of binary diagnostic test in medical research," Statistics in Transition new series, Główny Urząd Statystyczny (Polska), vol. 16(2), pages 203-222, June.
    4. F. Gauthier & D. Germain & B. Hétu, 2017. "Logistic models as a forecasting tool for snow avalanches in a cold maritime climate: northern Gaspésie, Québec, Canada," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 89(1), pages 201-232, October.
    5. Douglas Cumming & Lars Hornuf & Moein Karami & Denis Schweizer, 2023. "Disentangling Crowdfunding from Fraudfunding," Journal of Business Ethics, Springer, vol. 182(4), pages 1103-1128, February.
    6. Qing Li & Long Hai Vo, 2021. "Intangible Capital and Innovation: An Empirical Analysis of Vietnamese Enterprises," Economics Discussion / Working Papers 21-02, The University of Western Australia, Department of Economics.
    7. Eunae Yoo & Elliot Rabinovich & Bin Gu, 2020. "The Growth of Follower Networks on Social Media Platforms for Humanitarian Operations," Production and Operations Management, Production and Operations Management Society, vol. 29(12), pages 2696-2715, December.
    8. Joost Ginkel & Pieter Kroonenberg, 2014. "Using Generalized Procrustes Analysis for Multiple Imputation in Principal Component Analysis," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 242-269, July.
    9. Cemal Eren Arbatlı & Quamrul H. Ashraf & Oded Galor & Marc Klemp, 2018. "Diversity and Conflict," Working Papers 2018-6, Brown University, Department of Economics.
    10. Lo Turco, Alessia & Maggioni, Daniela, 2018. "Effects of Islamic religiosity on bilateral trust in trade: The case of Turkish exports," Journal of Comparative Economics, Elsevier, vol. 46(4), pages 947-965.
    11. Matija Kovacic & Claudio Zoli, 2021. "Ethnic distribution, effective power and conflict," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 57(2), pages 257-299, August.
    12. Blackman, Allen & Guerrero, Santiago, 2012. "What drives voluntary eco-certification in Mexico?," Journal of Comparative Economics, Elsevier, vol. 40(2), pages 256-268.
    13. Jacob Ausderan, 2018. "Reassessing the democratic advantage in interstate wars using k-adic datasets," Conflict Management and Peace Science, Peace Science Society (International), vol. 35(5), pages 451-473, September.
    14. Husson, François & Josse, Julie & Saporta, Gilbert, 2016. "Jan de Leeuw and the French School of Data Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 73(i06).
    15. Paul Poast, 2013. "Issue linkage and international cooperation: An empirical investigation," Conflict Management and Peace Science, Peace Science Society (International), vol. 30(3), pages 286-303, July.
    16. Yerko Rojas, 2017. "Evictions and short-term all-cause mortality: a 3-year follow-up study of a middle-aged Swedish population," International Journal of Public Health, Springer;Swiss School of Public Health (SSPH+), vol. 62(3), pages 343-351, April.
    17. Mehrez Ben Slama & Dhafer Saidane & Hassouna Fedhila, 2012. "How to identify targets in the M&A banking operations? Case of cross-border strategies in Europe by line of activity," Review of Quantitative Finance and Accounting, Springer, vol. 38(2), pages 209-240, February.
    18. Marcin Chlebus, 2014. "One-day prediction of state of turbulence for financial instrument based on models for binary dependent variable," Ekonomia journal, Faculty of Economic Sciences, University of Warsaw, vol. 37.
    19. Wang, Fa, 2017. "Maximum likelihood estimation and inference for high dimensional nonlinear factor models with application to factor-augmented regressions," MPRA Paper 93484, University Library of Munich, Germany, revised 19 May 2019.
    20. Lorenzo Cassi & Anne Plunket, 2014. "Proximity, network formation and inventive performance: in search of the proximity paradox," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 53(2), pages 395-422, September.
