Printed from https://ideas.repec.org/a/eee/jmvana/v92y2005i1p174-185.html

A generalized Mahalanobis distance for mixed data

Authors

  • de Leon, A. R.
  • Carrière, K. C.

Abstract

A distance for mixed nominal, ordinal and continuous data is developed by applying the Kullback-Leibler divergence to the general mixed-data model, an extension of the general location model that allows for ordinal variables to be incorporated in the model. The distance obtained can be considered as a generalization of the Mahalanobis distance to data with a mixture of nominal, ordinal and continuous variables. Moreover, it includes as special cases previous Mahalanobis-type distances developed by Bedrick et al. (Biometrics 56 (2000) 394) and Bar-Hen and Daudin (J. Multivariate Anal. 53 (1995) 332). Asymptotic results regarding the maximum likelihood estimator of the distance are discussed. The results of a simulation study on the level and power of the tests are reported and a real-data example illustrates the method.
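
The link between the Kullback-Leibler divergence and the Mahalanobis distance that the abstract relies on can be illustrated with a minimal sketch. It assumes the simpler general location model (nominal cells plus Gaussian continuous variables with a common covariance matrix), not the paper's general mixed-data model, so ordinal variables are not handled here. The function names and the parameter layout (`p1`, `p2` for cell probabilities, `mu1`, `mu2` for cell means, `cov` for the shared covariance) are hypothetical and chosen only for illustration. Under these assumptions the symmetrized divergence splits into a discrete term and a probability-weighted sum of within-cell squared Mahalanobis distances.

```python
import numpy as np


def kl_gaussian_common_cov(mu1, mu2, cov):
    """KL divergence between N(mu1, cov) and N(mu2, cov).

    With a shared covariance this equals half the squared
    Mahalanobis distance between the two mean vectors.
    """
    diff = np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float)
    return 0.5 * float(diff @ np.linalg.solve(cov, diff))


def symmetrized_kl_location_model(p1, p2, mu1, mu2, cov):
    """Symmetrized KL divergence between two location models (illustrative).

    p1, p2   : length-M arrays of strictly positive cell probabilities
    mu1, mu2 : (M, d) arrays of cell-specific means of the continuous part
    cov      : (d, d) covariance matrix, assumed common to all cells and groups
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    diffs = np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float)

    # Discrete part: symmetrized KL between the two cell distributions.
    discrete = np.sum((p1 - p2) * np.log(p1 / p2))

    # Continuous part: squared Mahalanobis distance within each cell,
    # weighted by the cell probabilities of both groups.
    solved = np.linalg.solve(cov, diffs.T).T  # rows are cov^{-1} @ diff_m
    maha_sq = np.einsum("md,md->m", diffs, solved)
    continuous = 0.5 * np.sum((p1 + p2) * maha_sq)

    return float(discrete + continuous)
```

With a single cell (no discrete variation) the discrete term vanishes and the result reduces to the ordinary squared Mahalanobis distance, which is the sense in which such a distance generalizes it.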

Suggested Citation

  • de Leon, A. R. & Carrière, K. C., 2005. "A generalized Mahalanobis distance for mixed data," Journal of Multivariate Analysis, Elsevier, vol. 92(1), pages 174-185, January.
  • Handle: RePEc:eee:jmvana:v:92:y:2005:i:1:p:174-185

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047-259X(03)00150-7
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Bar-Hen, A. & Daudin, J. J., 1995. "Generalization of the Mahalanobis Distance in the Mixed Case," Journal of Multivariate Analysis, Elsevier, vol. 53(2), pages 332-342, May.
    2. Krzanowski, W., 1984. "On the null distribution of distance between two groups, using mixed continuous and categorical variables," Journal of Classification, Springer;The Classification Society, vol. 1(1), pages 243-253, December.
    3. Bedrick, Edward J. & Lapidus, Jodi & Powell, Joseph F., 2000. "Estimating the Mahalanobis Distance from Mixed Continuous and Discrete Data," Biometrics, The International Biometric Society, vol. 56(2), pages 394-401, June.

    Citations

    Cited by:

    1. Cheng, Tsung-Chi & Biswas, Atanu, 2008. "Maximum trimmed likelihood estimator for multivariate mixed continuous and categorical data," Computational Statistics & Data Analysis, Elsevier, vol. 52(4), pages 2042-2065, January.
    2. Alban Mbina Mbina & Guy Martial Nkiet & Fulgence Eyi Obiang, 2019. "Variable selection in discriminant analysis for mixed continuous-binary variables and several groups," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(3), pages 773-795, September.
    3. Mortier, F. & Robin, S. & Lassalvy, S. & Baril, C.P. & Bar-Hen, A., 2006. "Prediction of Euclidean distances with discrete and continuous outcomes," Journal of Multivariate Analysis, Elsevier, vol. 97(8), pages 1799-1814, September.
    4. A. R. de Leon & A. Soo & T. Williamson, 2011. "Classification with discrete and continuous variables via general mixed-data models," Journal of Applied Statistics, Taylor & Francis Journals, vol. 38(5), pages 1021-1032, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alban Mbina Mbina & Guy Martial Nkiet & Fulgence Eyi Obiang, 2019. "Variable selection in discriminant analysis for mixed continuous-binary variables and several groups," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(3), pages 773-795, September.
    2. Merbouha, A. & Mkhadri, A., 2004. "Regularization of the location model in discrimination with mixed discrete and continuous variables," Computational Statistics & Data Analysis, Elsevier, vol. 45(3), pages 563-576, April.
    3. Cheng, Tsung-Chi & Biswas, Atanu, 2008. "Maximum trimmed likelihood estimator for multivariate mixed continuous and categorical data," Computational Statistics & Data Analysis, Elsevier, vol. 52(4), pages 2042-2065, January.
    4. Daudin, J. J. & Bar-Hen, A., 1999. "Selection in discriminant analysis with continuous and discrete variables," Computational Statistics & Data Analysis, Elsevier, vol. 32(2), pages 161-175, December.
    5. Mortier, F. & Robin, S. & Lassalvy, S. & Baril, C.P. & Bar-Hen, A., 2006. "Prediction of Euclidean distances with discrete and continuous outcomes," Journal of Multivariate Analysis, Elsevier, vol. 97(8), pages 1799-1814, September.
    6. Tang, John P., 2015. "Pollution havens and the trade in toxic chemicals: Evidence from U.S. trade flows," Ecological Economics, Elsevier, vol. 112(C), pages 150-160.
    7. Pierrette Chagneau & Frédéric Mortier & Nicolas Picard & Jean-Noël Bacro, 2011. "A Hierarchical Bayesian Model for Spatial Prediction of Multivariate Non-Gaussian Random Fields," Biometrics, The International Biometric Society, vol. 67(1), pages 97-105, March.
    8. de Leon, A.R., 2005. "Pairwise likelihood approach to grouped continuous model and its extension," Statistics & Probability Letters, Elsevier, vol. 75(1), pages 49-57, November.
    9. Chaubert, F. & Mortier, F. & Saint André, L., 2008. "Multivariate dynamic model for ordinal outcomes," Journal of Multivariate Analysis, Elsevier, vol. 99(8), pages 1717-1732, September.
    10. de Leon, A.R. & Zhu, Y., 2008. "ANOVA extensions for mixed discrete and continuous data," Computational Statistics & Data Analysis, Elsevier, vol. 52(4), pages 2218-2227, January.
