IDEAS home Printed from https://ideas.repec.org/a/eee/jmvana/v81y2002i1p67-84.html
   My bibliography  Save this article

Graph-Theoretic Procedures for Dimension Identification

Author

Listed:
  • Brito, María R.
  • Quiroz, Adolfo J.
  • Yukich, J. E.

Abstract

We consider the problem of identifying the dimension in which a sample of data points lives, when only their interpoint distances are known. We study as a random variable the average "reach" of vertices in the k-nearest-neighbors graph associated to the interpoint distance matrix, and we show how this variable can be used to accurately (from a probabilistic viewpoint) identify the unknown dimension at low computational cost. We discuss results that serve as the theoretical foundation for the methodology proposed. We illustrate how our method can help in dimension reduction procedures.

Suggested Citation

  • Brito, María R. & Quiroz, Adolfo J. & Yukich, J. E., 2002. "Graph-Theoretic Procedures for Dimension Identification," Journal of Multivariate Analysis, Elsevier, vol. 81(1), pages 67-84, April.
  • Handle: RePEc:eee:jmvana:v:81:y:2002:i:1:p:67-84
    as

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047-259X(01)91992-X
    Download Restriction: Full text for ScienceDirect subscribers only
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    as
    1. Brito, M. R. & Chávez, E. L. & Quiroz, A. J. & Yukich, J. E., 1997. "Connectivity of the mutual k-nearest-neighbor graph in clustering and outlier detection," Statistics & Probability Letters, Elsevier, vol. 35(1), pages 33-42, August.
    2. Wishik, S.M., 1978. "The use of incentives for fertility reduction," American Journal of Public Health, American Public Health Association, vol. 68(2), pages 113-114.
    3. J. Kruskal, 1964. "Nonmetric multidimensional scaling: A numerical method," Psychometrika, Springer;The Psychometric Society, vol. 29(2), pages 115-129, June.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. S. Camelo & M. González-Lima & A. Quiroz, 2015. "Nearest neighbors methods for support vector machines," Annals of Operations Research, Springer, vol. 235(1), pages 85-101, December.
    2. González-Barrios, José María & Quiroz, Adolfo J., 2003. "A clustering procedure based on the comparison between the k nearest neighbors graph and the minimal spanning tree," Statistics & Probability Letters, Elsevier, vol. 62(1), pages 23-34, March.
    3. Díaz, Mateo & Quiroz, Adolfo J. & Velasco, Mauricio, 2019. "Local angles and dimension estimation from data on manifolds," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 229-247.
    4. Brito, M.R. & Quiroz, A.J. & Yukich, J.E., 2013. "Intrinsic dimension identification via graph-theoretic methods," Journal of Multivariate Analysis, Elsevier, vol. 116(C), pages 263-277.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Samuel Shye, 2010. "The Motivation to Volunteer: A Systemic Quality of Life Theory," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 98(2), pages 183-200, September.
    2. Muñoz-Mas, Rafael & Vezza, Paolo & Alcaraz-Hernández, Juan Diego & Martínez-Capel, Francisco, 2016. "Risk of invasion predicted with support vector machines: A case study on northern pike (Esox Lucius, L.) and bleak (Alburnus alburnus, L.)," Ecological Modelling, Elsevier, vol. 342(C), pages 123-134.
    3. Karim Abou-Moustafa & Frank P. Ferrie, 2018. "Local generalized quadratic distance metrics: application to the k-nearest neighbors classifier," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(2), pages 341-363, June.
    4. Camacho, Maximo & Perez-Quiros, Gabriel & Saiz, Lorena, 2006. "Are European business cycles close enough to be just one?," Journal of Economic Dynamics and Control, Elsevier, vol. 30(9-10), pages 1687-1706.
    5. Nie, Chun-Xiao, 2020. "Correlation dynamics in the cryptocurrency market based on dimensionality reduction analysis," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 554(C).
    6. Mingxu Zhao & Nalaka Geekiyanage & Jianchu Xu & Myo Myo Khin & Dian Ridwan Nurdiana & Ekananda Paudel & Rhett Daniel Harrison, 2015. "Structure of the Epiphyte Community in a Tropical Montane Forest in SW China," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-19, April.
    7. Willem Heiser, 1991. "A generalized majorization method for least souares multidimensional scaling of pseudodistances that may be negative," Psychometrika, Springer;The Psychometric Society, vol. 56(1), pages 7-27, March.
    8. Luís Francisco Aguiar & Pedro C. Magalhães & Maria Joana Soares, 2010. "Synchronism in Electoral Cycles: How United are the United States?," NIPE Working Papers 17/2010, NIPE - Universidade do Minho.
    9. Kennen, Jonathan G. & Kauffman, Leon J. & Ayers, Mark A. & Wolock, David M. & Colarullo, Susan J., 2008. "Use of an integrated flow model to estimate ecologically relevant hydrologic characteristics at stream biomonitoring sites," Ecological Modelling, Elsevier, vol. 211(1), pages 57-76.
    10. José Luis Ortega Priego, 2003. "A Vector Space Model as a methodological approach to the Triple Helix dimensionality: A comparative study of Biology and Biomedicine Centres of two European National Research Councils from a Webometri," Scientometrics, Springer;Akadémiai Kiadó, vol. 58(2), pages 429-443, October.
    11. Jacques de Wet & Daniela Wetzelhütter & Johann Bacher, 2021. "Standardising the reproduction of Schwartz’s two-dimensional value space using multi-dimensional scaling and goodness-of-fit test procedures," Quality & Quantity: International Journal of Methodology, Springer, vol. 55(4), pages 1155-1179, August.
    12. Berrie Zielman & Willem Heiser, 1993. "Analysis of asymmetry by a slide-vector," Psychometrika, Springer;The Psychometric Society, vol. 58(1), pages 101-114, March.
    13. George Karabatsos, 2018. "On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood," Psychometrika, Springer;The Psychometric Society, vol. 83(2), pages 321-332, June.
    14. Stephen Polasky & Andrew Solow & James Broadus, 1993. "Searching for uncertain benefits and the conservation of biological diversity," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 3(2), pages 171-181, April.
    15. Yoshio Takane & Justine Sergent, 1983. "Multidimensional scaling models for reaction times and same-different judgments," Psychometrika, Springer;The Psychometric Society, vol. 48(3), pages 393-423, September.
    16. Ahmed Shamsul Arefin & Carlos Riveros & Regina Berretta & Pablo Moscato, 2012. "GPU-FS-kNN: A Software Tool for Fast and Scalable kNN Computation Using GPUs," PLOS ONE, Public Library of Science, vol. 7(8), pages 1-13, August.
    17. Nie, Chun-Xiao, 2022. "Analysis of critical events in the correlation dynamics of cryptocurrency market," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 586(C).
    18. Richard Johnson, 1973. "Pairwise nonmetric multidimensional scaling," Psychometrika, Springer;The Psychometric Society, vol. 38(1), pages 11-18, March.
    19. Jerzy Grobelny & Rafal Michalski & Gerhard-Wilhelm Weber, 2021. "Modeling human thinking about similarities by neuromatrices in the perspective of fuzzy logic," WORking papers in Management Science (WORMS) WORMS/21/09, Department of Operations Research and Business Intelligence, Wroclaw University of Science and Technology.
    20. Patrick Groenen & Bart-Jan Os & Jacqueline Meulman, 2000. "Optimal scaling by alternating length-constrained nonnegative least squares, with application to distance-based analysis," Psychometrika, Springer;The Psychometric Society, vol. 65(4), pages 511-524, December.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:jmvana:v:81:y:2002:i:1:p:67-84. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/wps/find/journaldescription.cws_home/622892/description#description .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.