General sparse multi-class linear discriminant analysis

Author

Listed:
  • Safo, Sandra E.
  • Ahn, Jeongyoun

Abstract

Discrimination with high dimensional data is often more effectively done with sparse methods that use a fraction of predictors rather than using all the available ones. In recent years, some effective sparse discrimination methods based on Fisher’s linear discriminant analysis (LDA) have been proposed for binary class problems. Extensions to multi-class problems are suggested in those works; however, they have some drawbacks such as the heavy computational cost for a large number of classes. We propose an approach to generalize a binary LDA solution into a multi-class solution while avoiding the limitations of the existing methods. Simulation studies with various settings, as well as real data examples including next generation sequencing data, confirm the effectiveness of the proposed approach.

Suggested Citation

  • Safo, Sandra E. & Ahn, Jeongyoun, 2016. "General sparse multi-class linear discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 99(C), pages 81-90.
  • Handle: RePEc:eee:csdana:v:99:y:2016:i:c:p:81-90
    DOI: 10.1016/j.csda.2016.01.011

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947316000207
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Brenton R. Graveley & Angela N. Brooks & Joseph W. Carlson & Michael O. Duff & Jane M. Landolin & Li Yang & Carlo G. Artieri & Marijke J. van Baren & Nathan Boley & Benjamin W. Booth & James B. Brown, 2011. "The developmental transcriptome of Drosophila melanogaster," Nature, Nature, vol. 471(7339), pages 473-479, March.
    2. Jeongyoun Ahn & J. S. Marron, 2010. "The maximal data piling direction for discrimination," Biometrika, Biometrika Trust, vol. 97(1), pages 254-259.
    3. Robert Tibshirani & Michael Saunders & Saharon Rosset & Ji Zhu & Keith Knight, 2005. "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(1), pages 91-108, February.
    4. Qing Mai & Hui Zou & Ming Yuan, 2012. "A direct approach to sparse discriminant analysis in ultra-high dimensions," Biometrika, Biometrika Trust, vol. 99(1), pages 29-42.
    5. Lee, Yoonkyung & Lin, Yi & Wahba, Grace, 2004. "Multicategory Support Vector Machines: Theory and Application to the Classification of Microarray Data and Satellite Radiance Data," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 67-81, January.

    Citations



    Cited by:

    1. Li-Pang Chen & Grace Y. Yi & Qihuang Zhang & Wenqing He, 2019. "Multiclass analysis and prediction with network structured covariates," Journal of Statistical Distributions and Applications, Springer, vol. 6(1), pages 1-25, December.
    2. Elżbieta Szaruga & Elżbieta Skąpska & Elżbieta Załoga & Wiesław Matwiejczuk, 2018. "Trust and Distress Prediction in Modal Shift Potential of Long-Distance Road Freight in Containers: Modeling Approach in Transport Services for Sustainability," Sustainability, MDPI, vol. 10(7), pages 1-19, July.
    3. Michael Fop & Pierre-Alexandre Mattei & Charles Bouveyron & Thomas Brendan Murphy, 2022. "Unobserved classes and extra variables in high-dimensional discriminant analysis," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(1), pages 55-92, March.
    4. Hirose, Kei & Miura, Kanta & Koie, Atori, 2023. "Hierarchical clustered multiclass discriminant analysis via cross-validation," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    5. Li-Pang Chen, 2022. "Network-Based Discriminant Analysis for Multiclassification," Journal of Classification, Springer;The Classification Society, vol. 39(3), pages 410-431, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ahn, Jeongyoun & Jeon, Yongho, 2015. "Sparse HDLSS discrimination with constrained data piling," Computational Statistics & Data Analysis, Elsevier, vol. 90(C), pages 74-83.
    2. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    3. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    4. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    5. Francis X. Diebold & Kamil Yilmaz, 2016. "Trans-Atlantic Equity Volatility Connectedness: U.S. and European Financial Institutions, 2004–2014," Journal of Financial Econometrics, Oxford University Press, vol. 14(1), pages 81-127.
    6. Yoonkyung Lee, 2014. "Comments on: Support vector machines maximizing geometric margins for multi-class classification," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 22(3), pages 852-855, October.
    7. Jian Guo & Elizaveta Levina & George Michailidis & Ji Zhu, 2010. "Pairwise Variable Selection for High-Dimensional Model-Based Clustering," Biometrics, The International Biometric Society, vol. 66(3), pages 793-804, September.
    8. Franck Rapaport & Christina Leslie, 2010. "Determining Frequent Patterns of Copy Number Alterations in Cancer," PLOS ONE, Public Library of Science, vol. 5(8), pages 1-10, August.
    9. Lu Tang & Ling Zhou & Peter X. K. Song, 2019. "Fusion learning algorithm to combine partially heterogeneous Cox models," Computational Statistics, Springer, vol. 34(1), pages 395-414, March.
    10. Young‐Geun Choi & Lawrence P. Hanrahan & Derek Norton & Ying‐Qi Zhao, 2022. "Simultaneous spatial smoothing and outlier detection using penalized regression, with application to childhood obesity surveillance from electronic health records," Biometrics, The International Biometric Society, vol. 78(1), pages 324-336, March.
    11. Molly C. Klanderman & Kathryn B. Newhart & Tzahi Y. Cath & Amanda S. Hering, 2020. "Fault isolation for a complex decentralized waste water treatment facility," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(4), pages 931-951, August.
    12. Oda, Ryoya & Suzuki, Yuya & Yanagihara, Hirokazu & Fujikoshi, Yasunori, 2020. "A consistent variable selection method in high-dimensional canonical discriminant analysis," Journal of Multivariate Analysis, Elsevier, vol. 175(C).
    13. Wang, Li-Yu & Park, Cheolwoo & Yeon, Kyupil & Choi, Hosik, 2017. "Tracking concept drift using a constrained penalized regression combiner," Computational Statistics & Data Analysis, Elsevier, vol. 108(C), pages 52-69.
    14. Tomáš Plíhal, 2021. "Scheduled macroeconomic news announcements and Forex volatility forecasting," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 40(8), pages 1379-1397, December.
    15. Crystal T. Nguyen & Daniel J. Luckett & Anna R. Kahkoska & Grace E. Shearrer & Donna Spruijt‐Metz & Jaimie N. Davis & Michael R. Kosorok, 2020. "Estimating individualized treatment regimes from crossover designs," Biometrics, The International Biometric Society, vol. 76(3), pages 778-788, September.
    16. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    17. Jianqing Fan & Yang Feng & Jiancheng Jiang & Xin Tong, 2016. "Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 275-287, March.
    18. Murat Genç & M. Revan Özkale, 2021. "Usage of the GO estimator in high dimensional linear models," Computational Statistics, Springer, vol. 36(1), pages 217-239, March.
    19. Aytug, Haldun & Sayın, Serpil, 2012. "Exploring the trade-off between generalization and empirical errors in a one-norm SVM," European Journal of Operational Research, Elsevier, vol. 218(3), pages 667-675.
    20. Hwang, Youngdeok & Kim, Hang J. & Chang, Won & Yeo, Kyongmin & Kim, Yongku, 2019. "Bayesian pollution source identification via an inverse physics model," Computational Statistics & Data Analysis, Elsevier, vol. 134(C), pages 76-92.
