IDEAS home Printed from https://ideas.repec.org/a/spr/advdac/v19y2025i1d10.1007_s11634-023-00576-0.html

QDA classification of high-dimensional data with rare and weak signals

Author

Listed:
  • Hanning Chen

    (University of Melbourne)

  • Qiang Zhao

    (Shandong Normal University)

  • Jingjing Wu

    (University of Calgary)

Abstract

This paper addresses the two-class classification problem for data with rare and weak signals in the modern high-dimensional setting $p \gg n$. Considering a two-component mixture of Gaussian features whose components have different random mean vectors with rare and weak signals but a common covariance matrix (the homoscedastic Gaussian case), Fan (AS 41:2537-2571, 2013) investigated the optimality of linear discriminant analysis (LDA) and proposed an efficient variable selection and classification procedure. We extend their work to the more general scenario in which the two components have different random covariance matrices that differ by rare and weak signals, in order to assess the effect of a difference in covariance matrices on classification. Under this model, we investigate the behaviour of the quadratic discriminant analysis (QDA) classifier. On the theoretical side, we derive the successful and unsuccessful classification regions of QDA. For data with rare signals, variable selection will in most cases improve the performance of statistical procedures. Thus, on the implementation side, we propose a variable selection procedure for QDA based on Higher Criticism Thresholding (HCT), which has been proved efficient for LDA. In addition, we conduct extensive simulation studies to demonstrate the successful and unsuccessful classification regions of QDA and to evaluate the effectiveness of the proposed HCT-thresholded QDA.
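As a rough illustration of the thresholding idea mentioned in the abstract, the sketch below picks a feature-selection threshold by Higher Criticism from a vector of per-feature z-scores. It follows one commonly used form of the HC objective from the rare-and-weak-signal literature and is not the authors' procedure for QDA, which is not described on this page. The function name `hc_threshold`, the search fraction `alpha0 = 0.10`, and the simulated z-scores are all assumptions made for the example.

```python
import math
import random


def hc_threshold(z_scores, alpha0=0.10):
    """Pick a threshold on |z| by maximising a Higher Criticism objective.

    Hypothetical helper for illustration only; alpha0 is the assumed
    fraction of the most significant features searched over.
    """
    p = len(z_scores)
    # Two-sided p-value of each z-score under the N(0, 1) null.
    pvals = [math.erfc(abs(z) / math.sqrt(2.0)) for z in z_scores]
    # Feature indices ordered from most to least significant.
    order = sorted(range(p), key=lambda j: -abs(z_scores[j]))
    k_max = max(1, int(alpha0 * p))
    best_k, best_hc = 1, -math.inf
    for k in range(1, k_max + 1):
        frac = k / p
        if frac >= 1.0:
            break  # objective is undefined at k = p
        # HC objective at the k-th smallest p-value.
        hc = (math.sqrt(p) * (frac - pvals[order[k - 1]])
              / math.sqrt(frac * (1.0 - frac)))
        if hc > best_hc:
            best_k, best_hc = k, hc
    # Threshold is the |z| of the feature at which the objective peaks.
    threshold = abs(z_scores[order[best_k - 1]])
    selected = sorted(order[:best_k])
    return threshold, selected


# Simulated z-scores (an assumption): 1000 features, the first 20 shifted.
random.seed(0)
p = 1000
z = [random.gauss(0.0, 1.0) for _ in range(p)]
for j in range(20):
    z[j] += 4.0

threshold, selected = hc_threshold(z)
print(f"threshold = {threshold:.3f}, features kept = {len(selected)}")
```

In a thresholded classifier, only the features in `selected` would be passed on to the discriminant rule; how the z-scores are formed for the covariance differences in the QDA setting is specific to the paper and is not reproduced here.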

Suggested Citation

  • Hanning Chen & Qiang Zhao & Jingjing Wu, 2025. "QDA classification of high-dimensional data with rare and weak signals," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 19(1), pages 31-65, March.
  • Handle: RePEc:spr:advdac:v:19:y:2025:i:1:d:10.1007_s11634-023-00576-0
    DOI: 10.1007/s11634-023-00576-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11634-023-00576-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhou Tang & Zhangsheng Yu & Cheng Wang, 2020. "A fast iterative algorithm for high-dimensional differential network," Computational Statistics, Springer, vol. 35(1), pages 95-109, March.
    2. Napoli, Philip M., 2015. "Social media and the public interest: Governance of news platforms in the realm of individual and algorithmic gatekeepers," Telecommunications Policy, Elsevier, vol. 39(9), pages 751-760.
    3. Marinela - Daniela Manea, 2016. "Corporate Social Responsibility between the Aim and the Reality of Implementation in the Romanian Companies," Risk in Contemporary Economy, "Dunarea de Jos" University of Galati, Faculty of Economics and Business Administration, pages 335-340.
    4. Pan, Yuqing & Mai, Qing, 2020. "Efficient computation for differential network analysis with applications to quadratic discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    5. Wenqin Du & Bailey K. Fosdick & Wen Zhou, 2025. "Regression Modeling of the Count Relational Data with Exchangeable Dependencies," Papers 2502.11255, arXiv.org.
    6. Zhang, Hongmei & Huang, Xianzheng & Han, Shengtong & Rezwan, Faisal I. & Karmaus, Wilfried & Arshad, Hasan & Holloway, John W., 2021. "Gaussian Bayesian network comparisons with graph ordering unknown," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    7. Wessel N. van Wieringen & Carel F. W. Peeters & Renee X. de Menezes & Mark A. van de Wiel, 2018. "Testing for pathway (in)activation by using Gaussian graphical models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 67(5), pages 1419-1436, November.
    8. Byol Kim & Song Liu & Mladen Kolar, 2021. "Two‐sample inference for high‐dimensional Markov networks," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 939-962, November.
    9. Djordjilović, Vera & Chiogna, Monica, 2022. "Searching for a source of difference in graphical models," Journal of Multivariate Analysis, Elsevier, vol. 190(C).
    10. Deepak Nag Ayyala & Santu Ghosh & Daniel F. Linder, 2022. "Covariance matrix testing in high dimension using random projections," Computational Statistics, Springer, vol. 37(3), pages 1111-1141, July.
    11. Aaron Hudson & Ali Shojaie, 2022. "Covariate-Adjusted Inference for Differential Analysis of High-Dimensional Networks," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 84(1), pages 345-388, June.
    12. Pircalabelu, Eugen, 2022. "WB-graphs: a within versus between group similarity interplay," LIDAM Discussion Papers ISBA 2022007, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    13. Huang, Xianzheng & Zhang, Hongmei, 2021. "Tests for differential Gaussian Bayesian networks based on quadratic inference functions," Computational Statistics & Data Analysis, Elsevier, vol. 159(C).
    14. Pedro Galeano & Daniel Peña, 2019. "Data science, big data and statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(2), pages 289-329, June.
    15. Jiadong Ji & Yong He & Lei Liu & Lei Xie, 2021. "Brain connectivity alteration detection via matrix‐variate differential network model," Biometrics, The International Biometric Society, vol. 77(4), pages 1409-1421, December.
