Printed from https://ideas.repec.org/p/pra/mprapa/59815.html

Two-Sample Tests for High Dimensional Means with Thresholding and Data Transformation

Author

Listed:
  • Chen, Song Xi
  • Li, Jun
  • Zhong, Pingshou

Abstract

We study two tests for the equality of two population mean vectors under high dimensionality and column-wise dependence, both based on thresholding. They are designed for better power when the mean vectors of the two populations differ only in sparsely populated coordinates. The first test is constructed by thresholding to remove the dimensions that carry no signal. The second test combines data transformation and thresholding: the data are first transformed with the precision matrix and then thresholded. The benefits of the thresholding and the data transformation are demonstrated in terms of reduced variance of the test statistics and improved power of the tests. Numerical analyses and an empirical study are performed to confirm the theoretical findings and to demonstrate the practical implementations.
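The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the marginal statistic (a squared Welch-type t-statistic per coordinate), the threshold level `2 * s * log(p)`, the permutation calibration (swapped in here in place of any asymptotic calibration), and the function names `thresholded_stat`, `transformed_thresholded_stat` and `permutation_pvalue`. The precision matrix `omega` is assumed to be supplied by the user; in practice it would have to be estimated.

```python
import numpy as np


def thresholded_stat(x, y, s=0.5):
    """Sum of squared marginal t-type statistics exceeding 2*s*log(p).

    x, y: arrays of shape (n1, p) and (n2, p). The threshold form is an
    assumption of this sketch, not taken from the abstract.
    """
    n1, p = x.shape
    n2 = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    se2 = x.var(axis=0, ddof=1) / n1 + y.var(axis=0, ddof=1) / n2
    t2 = diff ** 2 / se2
    lam = 2.0 * s * np.log(p)
    # Coordinates below the threshold are treated as carrying no signal.
    return float(np.sum(t2[t2 > lam]))


def transformed_thresholded_stat(x, y, omega, s=0.5):
    """Apply the (assumed known) precision matrix to every observation,
    then threshold the transformed coordinates."""
    return thresholded_stat(x @ omega.T, y @ omega.T, s)


def permutation_pvalue(x, y, stat, n_perm=200, seed=0):
    """Permutation p-value for stat(x, y); a generic calibration used
    here only to keep the sketch self-contained."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n1 = x.shape[0]
    observed = stat(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        exceed += stat(pooled[idx[:n1]], pooled[idx[n1:]]) >= observed
    return (1 + exceed) / (1 + n_perm)
```

A call such as `permutation_pvalue(x, y, thresholded_stat)` gives a p-value for the first idea; passing `lambda a, b: transformed_thresholded_stat(a, b, omega)` does the same for the second. Permuting rows keeps the column-wise dependence within each observation intact, which is why it was chosen for this illustration.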

Suggested Citation

  • Chen, Song Xi & Li, Jun & Zhong, Pingshou, 2014. "Two-Sample Tests for High Dimensional Means with Thresholding and Data Transformation," MPRA Paper 59815, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:59815

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/59815/1/MPRA_paper_59815.pdf
    File Function: original version
    Download Restriction: no

    References listed on IDEAS

    1. Srivastava, Muni S., 2009. "A test for the mean vector with fewer observations than the dimension under non-normality," Journal of Multivariate Analysis, Elsevier, vol. 100(3), pages 518-532, March.
    2. Chen, Song Xi & Qin, Yingli, 2010. "A Two Sample Test for High Dimensional Data with Applications to Gene-set Testing," MPRA Paper 59642, University Library of Munich, Germany.
    3. T. Tony Cai & Weidong Liu & Yin Xia, 2014. "Two-sample test of high dimensional means under dependence," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(2), pages 349-372, March.
    4. Jianhua Z. Huang & Naiping Liu & Mohsen Pourahmadi & Linxu Liu, 2006. "Covariance matrix selection and estimation via penalised normal likelihood," Biometrika, Biometrika Trust, vol. 93(1), pages 85-98, March.
    5. Wei Biao Wu, 2003. "Nonparametric estimation of large covariance matrices of longitudinal data," Biometrika, Biometrika Trust, vol. 90(4), pages 831-844, December.
    6. Aurore Delaigle & Peter Hall & Jiashun Jin, 2011. "Robustness and accuracy of methods for high dimensional data analysis based on Student's t‐statistic," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(3), pages 283-301, June.

    Citations

    Cited by:

    1. Wang, Wei & Lin, Nan & Tang, Xiang, 2019. "Robust two-sample test of high-dimensional mean vectors under dependence," Journal of Multivariate Analysis, Elsevier, vol. 169(C), pages 312-329.
    2. Zhang, Jin-Ting & Guo, Jia & Zhou, Bu, 2017. "Linear hypothesis testing in high-dimensional one-way MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 155(C), pages 200-216.
    3. Shen, Yanfeng & Lin, Zhengyan, 2015. "An adaptive test for the mean vector in large-p-small-n problems," Computational Statistics & Data Analysis, Elsevier, vol. 89(C), pages 25-38.
    4. Gongjun Xu & Lifeng Lin & Peng Wei & Wei Pan, 2016. "An adaptive two-sample test for high-dimensional means," Biometrika, Biometrika Trust, vol. 103(3), pages 609-624.
    5. Zhang, Jie & Pan, Meng, 2016. "A high-dimension two-sample test for the mean using cluster subspaces," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 87-97.
    6. Ayyala, Deepak Nag & Park, Junyong & Roy, Anindya, 2017. "Mean vector testing for high-dimensional dependent observations," Journal of Multivariate Analysis, Elsevier, vol. 153(C), pages 136-155.
    7. Anders Bredahl Kock & David Preinerstorfer, 2019. "Power in High‐Dimensional Testing Problems," Econometrica, Econometric Society, vol. 87(3), pages 1055-1069, May.
    8. Hyodo, Masashi & Watanabe, Hiroki & Seo, Takashi, 2018. "On simultaneous confidence interval estimation for the difference of paired mean vectors in high-dimensional settings," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 160-173.
    9. He, Yong & Zhang, Mingjuan & Zhang, Xinsheng & Zhou, Wang, 2020. "High-dimensional two-sample mean vectors test and support recovery with factor adjustment," Computational Statistics & Data Analysis, Elsevier, vol. 151(C).
    10. Zhao, Junguang & Xu, Xingzhong, 2016. "A generalized likelihood ratio test for normal mean when p is greater than n," Computational Statistics & Data Analysis, Elsevier, vol. 99(C), pages 91-104.
    11. Cai, T. Tony & Xia, Yin, 2014. "High-dimensional sparse MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 174-196.
    12. Ma, Yingying & Lan, Wei & Wang, Hansheng, 2015. "A high dimensional two-sample test under a low dimensional factor structure," Journal of Multivariate Analysis, Elsevier, vol. 140(C), pages 162-170.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. He, Yong & Zhang, Mingjuan & Zhang, Xinsheng & Zhou, Wang, 2020. "High-dimensional two-sample mean vectors test and support recovery with factor adjustment," Computational Statistics & Data Analysis, Elsevier, vol. 151(C).
    2. Amanda Plunkett & Junyong Park, 2019. "Two-sample test for sparse high-dimensional multinomial distributions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 804-826, September.
    3. Ayyala, Deepak Nag & Park, Junyong & Roy, Anindya, 2017. "Mean vector testing for high-dimensional dependent observations," Journal of Multivariate Analysis, Elsevier, vol. 153(C), pages 136-155.
    4. Cai, T. Tony & Xia, Yin, 2014. "High-dimensional sparse MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 174-196.
    5. Yuanyuan Jiang & Xingzhong Xu, 2022. "A Two-Sample Test of High Dimensional Means Based on Posterior Bayes Factor," Mathematics, MDPI, vol. 10(10), pages 1-23, May.
    6. Huang, Yuan & Li, Changcheng & Li, Runze & Yang, Songshan, 2022. "An overview of tests on high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    7. Harrar, Solomon W. & Kong, Xiaoli, 2022. "Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    8. Zhao, Junguang & Xu, Xingzhong, 2016. "A generalized likelihood ratio test for normal mean when p is greater than n," Computational Statistics & Data Analysis, Elsevier, vol. 99(C), pages 91-104.
    9. Chen, Songxi, 2012. "Two Sample Tests for High Dimensional Covariance Matrices," MPRA Paper 46026, University Library of Munich, Germany.
    10. Lam, Clifford, 2008. "Estimation of large precision matrices through block penalization," LSE Research Online Documents on Economics 31543, London School of Economics and Political Science, LSE Library.
    11. Saha, Enakshi & Sarkar, Soham & Ghosh, Anil K., 2017. "Some high-dimensional one-sample tests based on functions of interpoint distances," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 83-95.
    12. Zhang, Jie & Pan, Meng, 2016. "A high-dimension two-sample test for the mean using cluster subspaces," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 87-97.
    13. Shin-ichi Tsukada, 2019. "High dimensional two-sample test based on the inter-point distance," Computational Statistics, Springer, vol. 34(2), pages 599-615, June.
    14. Gautam Sabnis & Debdeep Pati & Anirban Bhattacharya, 2019. "Compressed Covariance Estimation with Automated Dimension Learning," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 81(2), pages 466-481, December.
    15. Jiang, Feiyu & Wang, Runmin & Shao, Xiaofeng, 2023. "Robust inference for change points in high dimension," Journal of Multivariate Analysis, Elsevier, vol. 193(C).
    16. Li, Jun, 2023. "Finite sample t-tests for high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
    17. Xiaoping Zhou & Dmitry Malioutov & Frank J. Fabozzi & Svetlozar T. Rachev, 2014. "Smooth monotone covariance for elliptical distributions and applications in finance," Quantitative Finance, Taylor & Francis Journals, vol. 14(9), pages 1555-1571, September.
    18. M. Ahmad, 2014. "A $$U$$ -statistic approach for a high-dimensional two-sample mean testing problem under non-normality and Behrens–Fisher setting," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 66(1), pages 33-61, February.
    19. Chi, Eric C. & Lange, Kenneth, 2014. "Stable estimation of a covariance matrix guided by nuclear norm penalties," Computational Statistics & Data Analysis, Elsevier, vol. 80(C), pages 117-128.
    20. Lopes, Hedibert F. & McCulloch, Robert E. & Tsay, Ruey S., 2022. "Parsimony inducing priors for large scale state–space models," Journal of Econometrics, Elsevier, vol. 230(1), pages 39-61.

    More about this item

    Keywords

    Data Transformation; Large deviation; Large p small n; Sparse signals; Thresholding

    JEL classification:

    • C0 - Mathematical and Quantitative Methods - - General
    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General
    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
