
Two-sample mean vector projection test in high-dimensional data

Authors

  • Caizhu Huang (University of Padova)
  • Xia Cui (Guangzhou University)
  • Euloge Clovis Kenne Pagui (University of Oslo)

Abstract

Statistical hypothesis testing for high-dimensional data poses challenges for modern inference. The Hotelling test is commonly applied to compare mean vectors when their dimension is fixed, but it becomes unavailable when the dimension diverges or exceeds the sample sizes. For these high-dimensional regimes, we propose a two-sample mean vector test statistic that adds a projection term based on the Euclidean norm of the mean vectors. The projection term improves power and ensures validity for dimensions greater than the sample sizes, without relying on any inverse matrices. The proposed projection statistic, suitably standardized, is approximately standard normal under mild conditions. Extensive simulation results, under different scenarios, show that the proposed approach achieves a comparable Type I error and improved power. We further illustrate its application by testing the equality of two acute lymphocytic leukemia genetic data sets and by a significance test of the “Sell in May and Go Away” effect in the China A-share stock market.
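
To make the abstract's motivation concrete, the following minimal Python sketch (not the authors' statistic, whose exact form is not given on this page) shows why a Hotelling-type statistic breaks down when the dimension exceeds the sample sizes: the pooled sample covariance is rank-deficient and so has no inverse, whereas a quantity based on the Euclidean norm of the mean difference needs no inverse at all. The function names, sample sizes, and simulated Gaussian data are assumptions chosen for illustration only.

```python
import numpy as np


def pooled_covariance(x, y):
    """Pooled sample covariance of two samples (rows are observations)."""
    n1, n2 = x.shape[0], y.shape[0]
    s1 = np.cov(x, rowvar=False)
    s2 = np.cov(y, rowvar=False)
    return ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)


def squared_norm_of_mean_difference(x, y):
    """Squared Euclidean norm of the difference of sample mean vectors.

    Illustrative only: this is the raw, unstandardized quantity, not the
    projection statistic proposed in the paper.
    """
    diff = x.mean(axis=0) - y.mean(axis=0)
    return float(diff @ diff)


# Hypothetical setting: dimension p larger than n1 + n2.
rng = np.random.default_rng(0)
n1, n2, p = 20, 25, 100
x = rng.normal(size=(n1, p))
y = rng.normal(size=(n2, p))

pooled = pooled_covariance(x, y)
rank = np.linalg.matrix_rank(pooled)

# The p x p pooled covariance has rank at most n1 + n2 - 2 < p, so it is
# singular and the inverse required by Hotelling's T^2 does not exist.
print(f"dimension p = {p}, rank of pooled covariance = {rank}")

# The norm-based quantity is still well defined.
stat = squared_norm_of_mean_difference(x, y)
print(f"squared norm of mean difference = {stat:.4f}")
```

Turning such a norm-based quantity into a test still requires centering and scaling it so that a reference distribution applies; the abstract states that the authors' suitably standardized statistic is approximately standard normal under mild conditions.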

Suggested Citation

  • Caizhu Huang & Xia Cui & Euloge Clovis Kenne Pagui, 2024. "Two-sample mean vector projection test in high-dimensional data," Computational Statistics, Springer, vol. 39(3), pages 1061-1091, May.
  • Handle: RePEc:spr:compst:v:39:y:2024:i:3:d:10.1007_s00180-023-01374-0
    DOI: 10.1007/s00180-023-01374-0


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jianghao Li & Shizhe Hong & Zhenzhen Niu & Zhidong Bai, 2025. "Test for high-dimensional linear hypothesis of mean vectors via random integration," Statistical Papers, Springer, vol. 66(1), pages 1-34, January.
    2. Huang, Yuan & Li, Changcheng & Li, Runze & Yang, Songshan, 2022. "An overview of tests on high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    3. Zhang, Jin-Ting & Zhou, Bu & Guo, Jia, 2022. "Linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA: A normal reference L2-norm based test," Journal of Multivariate Analysis, Elsevier, vol. 187(C).
    4. Jin-Ting Zhang & Bu Zhou & Jia Guo, 2022. "Testing high-dimensional mean vector with applications," Statistical Papers, Springer, vol. 63(4), pages 1105-1137, August.
    5. Saha, Enakshi & Sarkar, Soham & Ghosh, Anil K., 2017. "Some high-dimensional one-sample tests based on functions of interpoint distances," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 83-95.
    6. Zhang, Yu & Feng, Long, 2024. "Adaptive rank-based tests for high dimensional mean problems," Statistics & Probability Letters, Elsevier, vol. 214(C).
    7. Li, Jun, 2023. "Finite sample t-tests for high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
    8. Tianming Zhu, 2025. "The general linear hypothesis testing problem for multivariate functional data with applications," Statistical Papers, Springer, vol. 66(4), pages 1-32, June.
    9. Qiu, Tao & Xu, Wangli & Zhu, Liping, 2021. "Two-sample test in high dimensions through random selection," Computational Statistics & Data Analysis, Elsevier, vol. 160(C).
    10. Zhang, Jin-Ting & Zhu, Tianming, 2022. "A new normal reference test for linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    11. Wang, Wei & Lin, Nan & Tang, Xiang, 2019. "Robust two-sample test of high-dimensional mean vectors under dependence," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 312-329.
    12. Li, Yang & Wang, Zhaojun & Zou, Changliang, 2016. "A simpler spatial-sign-based two-sample test for high-dimensional data," Journal of Multivariate Analysis, Elsevier, vol. 149(C), pages 192-198.
    13. Yin, Yanqing, 2021. "Test for high-dimensional mean vector under missing observations," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    14. Feng, Long & Zhang, Xiaoxu & Liu, Binghui, 2020. "A high-dimensional spatial rank test for two-sample location problems," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    15. Yuanyuan Jiang & Xingzhong Xu, 2022. "A Two-Sample Test of High Dimensional Means Based on Posterior Bayes Factor," Mathematics, MDPI, vol. 10(10), pages 1-23, May.
    16. Harrar, Solomon W. & Kong, Xiaoli, 2022. "Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    17. Ouyang, Yanyan & Liu, Jiamin & Tong, Tiejun & Xu, Wangli, 2022. "A rank-based high-dimensional test for equality of mean vectors," Computational Statistics & Data Analysis, Elsevier, vol. 173(C).
    18. Zhang, Jie & Pan, Meng, 2016. "A high-dimension two-sample test for the mean using cluster subspaces," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 87-97.
    19. Jamshid Namdari & Debashis Paul & Lili Wang, 2021. "High-Dimensional Linear Models: A Random Matrix Perspective," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 83(2), pages 645-695, August.
    20. Shin-ichi Tsukada, 2019. "High dimensional two-sample test based on the inter-point distance," Computational Statistics, Springer, vol. 34(2), pages 599-615, June.
