Printed from https://ideas.repec.org/a/spr/testjl/v28y2019i3d10.1007_s11749-018-0600-8.html

Two-sample test for sparse high-dimensional multinomial distributions

Authors

  • Amanda Plunkett (Department of Defense)
  • Junyong Park (University of Maryland Baltimore County)

Abstract

In this paper we consider testing the equality of the probability vectors of two independent multinomial distributions in high dimensions. The classical chi-square test has drawbacks in this setting, since many of the cell counts may be zero or too small. We propose a new test and derive its asymptotic normality and asymptotic power function. Using the asymptotic power function, we apply our result to a previously studied neighborhood-type test, in particular for the case of fairly small p values. To compare the proposed test with existing tests, we provide numerical studies, including simulations and real data examples.
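The sparsity problem described in the abstract can be illustrated with a minimal sketch. The code below is not the test proposed in the paper. It only computes the classical Pearson chi-square statistic for two simulated multinomial samples and counts how many cells are empty or have small expected counts. The number of cells, the sample sizes, the uniform probability vector, the threshold of 5 for a "small" expected count, and all function names are assumptions chosen for illustration.

```python
import random

def sample_multinomial(n, probs, rng):
    """Draw n observations from a multinomial and return per-cell counts."""
    counts = [0] * len(probs)
    for cell in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[cell] += 1
    return counts

def pearson_two_sample(x, y):
    """Classical Pearson chi-square statistic for a 2 x k table.

    Cells that are empty in both samples are dropped, since their
    expected counts are zero. Returns (statistic, degrees of freedom,
    number of retained cells whose smallest expected count is below 5).
    """
    n1, n2 = sum(x), sum(y)
    stat, kept, small = 0.0, 0, 0
    for a, b in zip(x, y):
        total = a + b
        if total == 0:
            continue
        e1 = n1 * total / (n1 + n2)  # expected count in sample 1
        e2 = n2 * total / (n1 + n2)  # expected count in sample 2
        stat += (a - e1) ** 2 / e1 + (b - e2) ** 2 / e2
        kept += 1
        if min(e1, e2) < 5:
            small += 1
    return stat, kept - 1, small

rng = random.Random(0)
k = 2000                      # number of cells (assumed, illustrative)
probs = [1.0 / k] * k         # both samples share one probability vector
x = sample_multinomial(200, probs, rng)
y = sample_multinomial(200, probs, rng)
stat, df, small = pearson_two_sample(x, y)
zero_cells = sum(1 for a, b in zip(x, y) if a == 0 or b == 0)
print(f"cells with a zero count in at least one sample: {zero_cells} of {k}")
print(f"retained cells with expected count < 5: {small}")
print(f"Pearson statistic = {stat:.1f} on {df} degrees of freedom")
```

With far more cells than observations, most cells are empty in at least one sample, which is the regime in which the usual chi-square reference distribution becomes questionable.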

Suggested Citation

  • Amanda Plunkett & Junyong Park, 2019. "Two-sample test for sparse high-dimensional multinomial distributions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 804-826, September.
  • Handle: RePEc:spr:testjl:v:28:y:2019:i:3:d:10.1007_s11749-018-0600-8
    DOI: 10.1007/s11749-018-0600-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11749-018-0600-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Srivastava, Muni S., 2009. "A test for the mean vector with fewer observations than the dimension under non-normality," Journal of Multivariate Analysis, Elsevier, vol. 100(3), pages 518-532, March.
    2. Chen, Song Xi & Qin, Yingli, 2010. "A Two Sample Test for High Dimensional Data with Applications to Gene-set Testing," MPRA Paper 59642, University Library of Munich, Germany.
    3. T. Tony Cai & Weidong Liu & Yin Xia, 2014. "Two-sample test of high dimensional means under dependence," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(2), pages 349-372, March.
    4. Munk, A. & Paige, R. & Pang, J. & Patrangenaru, V. & Ruymgaart, F., 2008. "The one- and multi-sample problem for functional data with application to projective shape analysis," Journal of Multivariate Analysis, Elsevier, vol. 99(5), pages 815-833, May.
    5. Srivastava, Muni S. & Katayama, Shota & Kano, Yutaka, 2013. "A two sample test in high dimensional data," Journal of Multivariate Analysis, Elsevier, vol. 114(C), pages 349-358.
    6. Junyong Park & Bimal Sinha & Arvind Shah & Dihua Xu & Jianxin Lin, 2015. "Likelihood Ratio Tests for Interval Hypotheses with Applications," Communications in Statistics - Theory and Methods, Taylor & Francis Journals, vol. 44(11), pages 2351-2370, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pini, Alessia & Stamm, Aymeric & Vantini, Simone, 2018. "Hotelling’s T2 in separable Hilbert spaces," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 284-305.
    2. Ayyala, Deepak Nag & Park, Junyong & Roy, Anindya, 2017. "Mean vector testing for high-dimensional dependent observations," Journal of Multivariate Analysis, Elsevier, vol. 153(C), pages 136-155.
    3. Yuanyuan Jiang & Xingzhong Xu, 2022. "A Two-Sample Test of High Dimensional Means Based on Posterior Bayes Factor," Mathematics, MDPI, vol. 10(10), pages 1-23, May.
    4. Harrar, Solomon W. & Kong, Xiaoli, 2022. "Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    5. Shin-ichi Tsukada, 2019. "High dimensional two-sample test based on the inter-point distance," Computational Statistics, Springer, vol. 34(2), pages 599-615, June.
    6. Zhang, Jin-Ting & Guo, Jia & Zhou, Bu, 2017. "Linear hypothesis testing in high-dimensional one-way MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 155(C), pages 200-216.
    7. Anil K. Ghosh & Munmun Biswas, 2016. "Distribution-free high-dimensional two-sample tests based on discriminating hyperplanes," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(3), pages 525-547, September.
    8. Zhang, Jin-Ting & Zhu, Tianming, 2022. "A new normal reference test for linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    9. Ma, Yingying & Lan, Wei & Wang, Hansheng, 2015. "A high dimensional two-sample test under a low dimensional factor structure," Journal of Multivariate Analysis, Elsevier, vol. 140(C), pages 162-170.
    10. Jiang Hu & Zhidong Bai & Chen Wang & Wei Wang, 2017. "On testing the equality of high dimensional mean vectors with unequal covariance matrices," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(2), pages 365-387, April.
    11. Dong, Kai & Pang, Herbert & Tong, Tiejun & Genton, Marc G., 2016. "Shrinkage-based diagonal Hotelling’s tests for high-dimensional small sample size data," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 127-142.
    12. Yin, Yanqing, 2021. "Test for high-dimensional mean vector under missing observations," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    13. Zhengbang Li & Fuxiang Liu & Luanjie Zeng & Guoxin Zuo, 2021. "A stationary bootstrap test about two mean vectors comparison with somewhat dense differences and fewer sample size than dimension," Computational Statistics, Springer, vol. 36(2), pages 941-960, June.
    14. Cai, T. Tony & Xia, Yin, 2014. "High-dimensional sparse MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 174-196.
    15. Hyodo, Masashi & Watanabe, Hiroki & Seo, Takashi, 2018. "On simultaneous confidence interval estimation for the difference of paired mean vectors in high-dimensional settings," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 160-173.
    16. Davy Paindaveine & Thomas Verdebout, 2013. "Universal Asymptotics for High-Dimensional Sign Tests," Working Papers ECARES ECARES 2013-40, ULB -- Universite Libre de Bruxelles.
    17. Huang, Yuan & Li, Changcheng & Li, Runze & Yang, Songshan, 2022. "An overview of tests on high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    18. Ghosh, Santu & Ayyala, Deepak Nag & Hellebuyck, Rafael, 2021. "Two-sample high dimensional mean test based on prepivots," Computational Statistics & Data Analysis, Elsevier, vol. 163(C).
    19. Huiqin Li & Jiang Hu & Zhidong Bai & Yanqing Yin & Kexin Zou, 2017. "Test on the linear combinations of mean vectors in high-dimensional data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 26(1), pages 188-208, March.
    20. Zhao, Junguang & Xu, Xingzhong, 2016. "A generalized likelihood ratio test for normal mean when p is greater than n," Computational Statistics & Data Analysis, Elsevier, vol. 99(C), pages 91-104.
