Printed from https://ideas.repec.org/a/taf/japsta/v38y2011i5p1021-1032.html

Classification with discrete and continuous variables via general mixed-data models

Authors

  • A. R. de Leon
  • A. Soo
  • T. Williamson

Abstract

We study the problem of classifying an individual into one of several populations based on mixed nominal, continuous, and ordinal data. Specifically, we obtain a classification procedure as an extension to the so-called location linear discriminant function, by specifying a general mixed-data model for the joint distribution of the mixed discrete and continuous variables. We outline methods for estimating misclassification error rates. Results of simulations of the performance of proposed classification rules in various settings vis-à-vis a robust mixed-data discrimination method are reported as well. We give an example utilizing data on croup in children.
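For orientation, the classical location linear discriminant function that the abstract takes as its starting point can be sketched as follows. This is only an illustration of the standard location model (the discrete variables define cells, and the continuous variables are Gaussian within each cell with a common covariance), not the authors' general mixed-data model, which also covers ordinal variables. The function names `fit_location_model` and `classify`, the add-one smoothing of cell probabilities, the group-mean fallback for empty cells, the equal priors, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def fit_location_model(cells, y, groups, n_cells, n_groups):
    """Estimate log cell probabilities, cell means, and the inverse pooled covariance."""
    cells = np.asarray(cells)
    groups = np.asarray(groups)
    y = np.asarray(y, dtype=float)
    n, q = y.shape
    log_p = np.zeros((n_groups, n_cells))
    means = np.zeros((n_groups, n_cells, q))
    resid = np.zeros_like(y)
    for k in range(n_groups):
        in_k = groups == k
        counts = np.bincount(cells[in_k], minlength=n_cells)
        # Add-one smoothing (an assumption) avoids log(0) for empty cells.
        log_p[k] = np.log((counts + 1.0) / (counts.sum() + n_cells))
        group_mean = y[in_k].mean(axis=0)
        for m in range(n_cells):
            idx = in_k & (cells == m)
            # Empty cells fall back to the group mean (also an assumption).
            means[k, m] = y[idx].mean(axis=0) if idx.any() else group_mean
            resid[idx] = y[idx] - means[k, m]
    # Pooled within-cell covariance, shared by all groups and cells.
    n_nonempty = len(set(zip(groups.tolist(), cells.tolist())))
    sigma = resid.T @ resid / max(n - n_nonempty, 1)
    return log_p, means, np.linalg.inv(sigma)

def classify(cell, y_new, log_p, means, sigma_inv):
    """Assign to the group with the largest linear score (equal priors assumed)."""
    mu = means[:, cell, :]                 # one mean vector per group
    coef = mu @ sigma_inv                  # Sigma^{-1} mu for each group (Sigma is symmetric)
    scores = log_p[:, cell] + coef @ y_new - 0.5 * np.sum(coef * mu, axis=1)
    return int(np.argmax(scores))

# Simulated (hypothetical) data: one binary variable, two continuous variables.
rng = np.random.default_rng(0)
n_per = 100
groups = np.repeat([0, 1], n_per)
cells = rng.integers(0, 2, size=2 * n_per)
y = rng.normal(size=(2 * n_per, 2)) + 4.0 * groups[:, None] + 1.0 * cells[:, None]

log_p, means, sigma_inv = fit_location_model(cells, y, groups, n_cells=2, n_groups=2)
pred_a = classify(0, np.array([0.0, 0.0]), log_p, means, sigma_inv)
pred_b = classify(1, np.array([5.0, 5.0]), log_p, means, sigma_inv)
print(pred_a, pred_b)
```

Within a given cell the score is linear in the continuous variables, which is why the rule is described as a location linear discriminant function; the discrete variables enter only through the cell-specific means and cell probabilities.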

Suggested Citation

  • A. R. de Leon & A. Soo & T. Williamson, 2011. "Classification with discrete and continuous variables via general mixed-data models," Journal of Applied Statistics, Taylor & Francis Journals, vol. 38(5), pages 1021-1032, February.
  • Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1021-1032
    DOI: 10.1080/02664761003758976

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/02664761003758976
    Download Restriction: Access to full text is restricted to subscribers.


    Citations


    Cited by:

    1. Leila Amiri & Mojtaba Khazaei & Mojtaba Ganjali, 2017. "General location model with factor analyzer covariance matrix structure and its applications," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(3), pages 593-609, September.
    2. Amparo Baíllo & Aurea Grané, 2021. "Subsampling and Aggregation: A Solution to the Scalability Problem in Distance-Based Prediction for Mixed-Type Data," Mathematics, MDPI, vol. 9(18), pages 1-17, September.
    3. Bhat, Chandra R., 2015. "A new generalized heterogeneous data model (GHDM) to jointly model mixed types of dependent variables," Transportation Research Part B: Methodological, Elsevier, vol. 79(C), pages 50-77.
    4. Miguel Angel Ortíz-Barrios & Matias Garcia-Constantino & Chris Nugent & Isaac Alfaro-Sarmiento, 2022. "A Novel Integration of IF-DEMATEL and TOPSIS for the Classifier Selection Problem in Assistive Technology Adoption for People with Dementia," IJERPH, MDPI, vol. 19(3), pages 1-31, January.
    5. Alban Mbina Mbina & Guy Martial Nkiet & Fulgence Eyi Obiang, 2019. "Variable selection in discriminant analysis for mixed continuous-binary variables and several groups," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(3), pages 773-795, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mortier, F. & Robin, S. & Lassalvy, S. & Baril, C.P. & Bar-Hen, A., 2006. "Prediction of Euclidean distances with discrete and continuous outcomes," Journal of Multivariate Analysis, Elsevier, vol. 97(8), pages 1799-1814, September.
    2. Layal Christine Lettry, 2023. "Clustering the Swiss Pension Register," FSES Working Papers 529, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    3. Paul S. Albert & Lisa M. McShane & Edward L. Korn, 2002. "Design of a Binary Biomarker Study from the Results of a Pilot Study," Biometrics, The International Biometric Society, vol. 58(3), pages 576-585, September.
    4. Nor Mahat & W.J. Krzanowski & A. Hernandez, 2009. "Strategies for Non-Parametric Smoothing of the Location Model in Mixed-Variable Discriminant Analysis," Modern Applied Science, Canadian Center of Science and Education, vol. 3(1), pages 151-151, January.
    5. Alban Mbina Mbina & Guy Martial Nkiet & Fulgence Eyi Obiang, 2019. "Variable selection in discriminant analysis for mixed continuous-binary variables and several groups," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(3), pages 773-795, September.
    6. Leila Amiri & Mojtaba Khazaei & Mojtaba Ganjali, 2018. "A mixture latent variable model for modeling mixed data in heterogeneous populations and its applications," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 102(1), pages 95-115, January.
    7. Chi-Ying Leung, 2001. "Error rates in classification consisting of discrete and continuous variables in the presence of covariates," Statistical Papers, Springer, vol. 42(2), pages 265-273, April.
    8. Poon, Wai-Yin & Hung, Hin-Yan, 1996. "Analysis of square tables with ordered categories," Computational Statistics & Data Analysis, Elsevier, vol. 22(3), pages 303-322, July.
    9. Merbouha, A. & Mkhadri, A., 2004. "Regularization of the location model in discrimination with mixed discrete and continuous variables," Computational Statistics & Data Analysis, Elsevier, vol. 45(3), pages 563-576, April.
    10. Wai Chan & Peter Bentler, 1998. "Covariance structure analysis of ordinal ipsative data," Psychometrika, Springer;The Psychometric Society, vol. 63(4), pages 369-399, December.
    11. Steffen Fieuws & Geert Verbeke, 2006. "Pairwise Fitting of Mixed Models for the Joint Modeling of Multivariate Longitudinal Profiles," Biometrics, The International Biometric Society, vol. 62(2), pages 424-431, June.
    12. Florian Schuberth & Jörg Henseler & Theo K. Dijkstra, 2018. "Partial least squares path modeling using ordinal categorical indicators," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(1), pages 9-35, January.
    13. Katsikatsou, Myrsini & Moustaki, Irini & Md Jamil, Haziq, 2022. "Pairwise likelihood estimation for confirmatory factor analysis models with categorical variables and data that are missing at random," LSE Research Online Documents on Economics 108933, London School of Economics and Political Science, LSE Library.
    14. Li, Zhengtao & Folmer, Henk & Xue, Jianhong, 2014. "To what extent does air pollution affect happiness? The case of the Jinchuan mining area, China," Ecological Economics, Elsevier, vol. 99(C), pages 88-99.
    15. Colin O. Wu & Gang Zheng & Minjung Kwak, 2013. "A Joint Regression Analysis for Genetic Association Studies with Outcome Stratified Samples," Biometrics, The International Biometric Society, vol. 69(2), pages 417-426, June.
    16. Sik-Yum Lee & Wai-Yin Poon & P. Bentler, 1989. "Simultaneous analysis of multivariate polytomous variates in several groups," Psychometrika, Springer;The Psychometric Society, vol. 54(1), pages 63-73, March.
    17. Leila Amiri & Mojtaba Khazaei & Mojtaba Ganjali, 2017. "General location model with factor analyzer covariance matrix structure and its applications," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(3), pages 593-609, September.
    18. de Leon, A.R., 2005. "Pairwise likelihood approach to grouped continuous model and its extension," Statistics & Probability Letters, Elsevier, vol. 75(1), pages 49-57, November.
    19. Piotr Tarka, 2018. "An overview of structural equation modeling: its beginnings, historical development, usefulness and controversies in the social sciences," Quality & Quantity: International Journal of Methodology, Springer, vol. 52(1), pages 313-354, January.
    20. Hao Bai & Yuan Zhong & Xin Gao & Wei Xu, 2020. "Multivariate Mixed Response Model with Pairwise Composite-Likelihood Method," Stats, MDPI, vol. 3(3), pages 1-18, July.
