Printed from https://ideas.repec.org/a/eee/jmvana/v188y2022ics0047259x21001408.html

Principal component analysis and clustering on manifolds

Author

Listed:
  • Mardia, Kanti V.
  • Wiechers, Henrik
  • Eltzner, Benjamin
  • Huckemann, Stephan F.

Abstract

Big data, high-dimensional data, sparse data, large-scale data, and imaging data are all becoming new frontiers of statistics. Changing technologies have created this flood of data and have led to a real hunger among scientists for new modeling strategies and methods of data analysis. In many cases the data are not Euclidean; in molecular biology, for example, the data sit on manifolds. Even on a simple non-Euclidean manifold such as the circle, summarizing angles by their arithmetic average does not make sense, so more care is needed. Non-Euclidean settings thus throw up many major challenges, both mathematical and statistical. This paper focuses on PCA and clustering methods for some manifolds. PCA and clustering are, of course, among the core topics of multivariate analysis.
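
The abstract's remark about averaging angles can be illustrated with a small sketch. This is not taken from the paper: it assumes the standard circular mean (the direction of the resultant of unit vectors) as the alternative summary, and the function name `circular_mean` and the sample angles are hypothetical, chosen only for illustration.

```python
import math


def circular_mean(angles_deg):
    """Circular mean in degrees: the direction of the summed unit vectors.

    Hypothetical helper for illustration; not the method of the paper.
    """
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    # atan2 recovers the direction of the resultant; reduce it to [0, 360].
    return math.degrees(math.atan2(s, c)) % 360.0


# Two directions that lie 10 degrees either side of 0 degrees.
angles = [350.0, 10.0]

# The arithmetic average points in the opposite direction to both angles.
arithmetic = sum(angles) / len(angles)

# The circular mean stays close to 0 degrees (equivalently, 360 degrees).
circular = circular_mean(angles)

print("arithmetic average:", arithmetic)
print("circular mean:", circular)
```

Because floating-point rounding can place the circular mean just below 360 rather than just above 0, any comparison of the result should be made modulo 360 degrees.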

Suggested Citation

  • Mardia, Kanti V. & Wiechers, Henrik & Eltzner, Benjamin & Huckemann, Stephan F., 2022. "Principal component analysis and clustering on manifolds," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
  • Handle: RePEc:eee:jmvana:v:188:y:2022:i:c:s0047259x21001408
    DOI: 10.1016/j.jmva.2021.104862

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X21001408
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kanti V. Mardia & Karthik Sriram, 2023. "Families of Discrete Circular Distributions with Some Novel Applications," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 85(1), pages 1-42, February.
    2. Fernández de Marcos Giménez de los Galanes, Alberto & García Portugués, Eduardo, 2022. "Data-driven stabilizations of goodness-of-fit tests," DES - Working Papers. Statistics and Econometrics. WS 35324, Universidad Carlos III de Madrid. Departamento de Estadística.
    3. Fernández-de-Marcos, Alberto & García-Portugués, Eduardo, 2023. "Data-driven stabilizations of goodness-of-fit tests," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    4. Andrew Harvey & Dario Palumbo, 2023. "Regime switching models for circular and linear time series," Journal of Time Series Analysis, Wiley Blackwell, vol. 44(4), pages 374-392, July.
    5. Jeon, Jeong Min & Van Keilegom, Ingrid, 2023. "Density estimation for mixed Euclidean and non-Euclidean data in the presence of measurement error," Journal of Multivariate Analysis, Elsevier, vol. 193(C).
    6. Andrade, Ana C.C. & Pereira, Gustavo H.A. & Artes, Rinaldo, 2023. "The circular quantile residual," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    7. Meyners, Michael & Qannari, El Mostafa, 2001. "Relating principal component analysis on merged data sets to a regression approach," Technical Reports 2001,47, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    8. Wen Shi & Xi Chen & Jennifer Shang, 2019. "An Efficient Morris Method-Based Framework for Simulation Factor Screening," INFORMS Journal on Computing, INFORMS, vol. 31(4), pages 745-770, October.
    9. Juliana Martins Ruzante & Valerie J. Davidson & Julie Caswell & Aamir Fazil & John A. L. Cranfield & Spencer J. Henson & Sven M. Anders & Claudia Schmidt & Jeffrey M. Farber, 2010. "A Multifactorial Risk Prioritization Framework for Foodborne Pathogens," Risk Analysis, John Wiley & Sons, vol. 30(5), pages 724-742, May.
    10. Barbara McGillivray & Gard B. Jenset & Khalid Salama & Donna Schut, 2022. "Investigating patterns of change, stability, and interaction among scientific disciplines using embeddings," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-15, December.
    11. Wei Wang & Stephen J Lycett & Noreen von Cramon-Taubadel & Jennie J H Jin & Christopher J Bae, 2012. "Comparison of Handaxes from Bose Basin (China) and the Western Acheulean Indicates Convergence of Form, Not Cognitive Differences," PLOS ONE, Public Library of Science, vol. 7(4), pages 1-7, April.
    12. Kanti V. Mardia, 2021. "Comments on: Recent advances in directional statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 59-63, March.
    13. Lisa Sakamoto & Hiromi Kajiya-Kanegae & Koji Noshita & Hideki Takanashi & Masaaki Kobayashi & Toru Kudo & Kentaro Yano & Tsuyoshi Tokunaga & Nobuhiro Tsutsumi & Hiroyoshi Iwata, 2019. "Comparison of shape quantification methods for genomic prediction, and genome-wide association study of sorghum seed morphology," PLOS ONE, Public Library of Science, vol. 14(11), pages 1-15, November.
    14. M. Jones & Arthur Pewsey & Shogo Kato, 2015. "On a class of circulas: copulas for circular distributions," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(5), pages 843-862, October.
    15. Ibrahim, Muhammad Sohail & Dong, Wei & Yang, Qiang, 2020. "Machine learning driven smart electric power systems: Current trends and new perspectives," Applied Energy, Elsevier, vol. 272(C).
    16. John Gower & Garmt Dijksterhuis, 1994. "Multivariate analysis of coffee images: A study in the simultaneous display of multivariate quantitative and qualitative variables for several assessors," Quality & Quantity: International Journal of Methodology, Springer, vol. 28(2), pages 165-184, May.
    17. Lazar, Drew & Lin, Lizhen, 2017. "Scale and curvature effects in principal geodesic analysis," Journal of Multivariate Analysis, Elsevier, vol. 153(C), pages 64-82.
    18. Modroño Herrán, Juan Ignacio & Fernández Aguirre, María Carmen & Landaluce Calvo, M. Isabel, 2003. "Una propuesta para el análisis de tablas múltiples," BILTOKI 1134-8984, Universidad del País Vasco - Departamento de Economía Aplicada III (Econometría y Estadística).
    19. Peter Verboon & Willem Heiser, 1992. "Resistant orthogonal procrustes analysis," Journal of Classification, Springer;The Classification Society, vol. 9(2), pages 237-256, December.
    20. Jun Li & Wei Zhu & Jun Wang & Wenfei Li & Sheng Gong & Jian Zhang & Wei Wang, 2018. "RNA3DCNN: Local and global quality assessments of RNA 3D structures using 3D deep convolutional neural networks," PLOS Computational Biology, Public Library of Science, vol. 14(11), pages 1-18, November.
