
On Visualizing Mixed-Type Data

Authors

  • Aurea Grané
  • Rosario Romera

Abstract

Survey data are usually of mixed type (quantitative, multistate categorical, and/or binary variables). Multidimensional scaling (MDS) is one of the most widely used methodologies for visualizing the profile structure of such data. MDS methods have been present in the literature since the 1960s, initially in psychometrics publications. Nevertheless, the sensitivity and robustness of MDS configurations have scarcely been addressed in the specialized literature. In this work, we are interested in constructing robust profiles for mixed-type data using a proper MDS configuration. To this end, we propose comparing different MDS configurations (obtained from different metrics) through a combination of sensitivity and robustness analysis. In particular, as an alternative to the classical Gower’s metric, we propose a robust joint metric that combines different distance matrices while avoiding redundant information, via related metric scaling. The search for robustness and the identification of outliers are carried out through a distance-based procedure related to geometric variability notions. In this sense, we propose a statistic for detecting multivariate outliers in the context of mixed-type data and evaluate its performance through a simulation study. Finally, we apply these techniques to a real data set provided by the largest humanitarian organization involved in social programs in Spain, where we are able to identify, in a robust way, the most relevant factors defining the profiles of people who were at risk of social exclusion at the beginning of the 2008 economic crisis.
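
For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of a Gower-type distance for mixed data followed by classical (metric) MDS. It is not the authors' robust joint metric or their outlier statistic, neither of which is specified in this record. The function names, the toy data, and the choice to treat binary variables as two-state categoricals with simple matching are all assumptions made for illustration.

```python
import numpy as np

def gower_squared_distances(quant, categ):
    """Squared Gower-type distances d_ij^2 = 1 - s_ij (hypothetical helper).

    quant: (n, p1) array of quantitative variables.
    categ: (n, p2) array of category codes; binary variables are treated
           as two-state categoricals here, which is a simplification.
    """
    quant = np.asarray(quant, dtype=float)
    categ = np.asarray(categ)
    p1, p2 = quant.shape[1], categ.shape[1]
    ranges = quant.max(axis=0) - quant.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant columns contribute full similarity
    # Range-normalised similarity for each quantitative variable.
    diff = np.abs(quant[:, None, :] - quant[None, :, :]) / ranges
    s_quant = (1.0 - diff).sum(axis=2)
    # Simple matching for each categorical variable.
    s_categ = (categ[:, None, :] == categ[None, :, :]).sum(axis=2)
    similarity = (s_quant + s_categ) / (p1 + p2)
    return 1.0 - similarity

def classical_mds(d2, k=2):
    """Classical MDS coordinates from a matrix of squared distances."""
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ d2 @ centering  # doubly centred inner products
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:k]  # k largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)

# Toy data (invented): one quantitative and one categorical variable.
quant = np.array([[0.0], [10.0], [5.0]])
categ = np.array([[0], [1], [0]])
d2 = gower_squared_distances(quant, categ)
coords = classical_mds(d2, k=2)
```

Plotting the rows of `coords` gives the kind of two-dimensional profile map the abstract refers to; swapping in a different squared-distance matrix is how configurations from different metrics could be compared.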

Suggested Citation

  • Aurea Grané & Rosario Romera, 2018. "On Visualizing Mixed-Type Data," Sociological Methods & Research, vol. 47(2), pages 207-239, March.
  • Handle: RePEc:sae:somere:v:47:y:2018:i:2:p:207-239
    DOI: 10.1177/0049124115621334

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.1177/0049124115621334
    Download Restriction: no



