Printed from https://ideas.repec.org/a/eee/jmvana/v188y2022ics0047259x21001330.html

Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches

Author

Listed:
  • Harrar, Solomon W.
  • Kong, Xiaoli

Abstract

In this paper, we give the most current account of methods for comparison of populations or treatment groups with high-dimensional data. We conveniently group the methods into three categories based on the hypothesis of interest and the model assumptions they make. We offer some perspectives on the connections and distinctions among the tests and discuss the ramifications of the model assumptions for practical applications. Among other things, we discuss the interpretation of the hypotheses and results of the appropriate tests and how this distinguishes the methods in terms of what data type they are suitable for. Further, we provide a discussion of computational complexity and a list of available R package implementations and their limitations. Finally, we illustrate the numerical performance of the various tests in a simulation study.

Suggested Citation

  • Harrar, Solomon W. & Kong, Xiaoli, 2022. "Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
  • Handle: RePEc:eee:jmvana:v:188:y:2022:i:c:s0047259x21001330
    DOI: 10.1016/j.jmva.2021.104855

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X21001330
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Long Feng & Changliang Zou & Zhaojun Wang, 2016. "Multivariate-Sign-Based High-Dimensional Tests for the Two-Sample Location Problem," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(514), pages 721-735, April.
    2. Thulin, Måns, 2014. "A high-dimensional two-sample test for the mean using random subspaces," Computational Statistics & Data Analysis, Elsevier, vol. 74(C), pages 26-38.
    3. Bathke, Arne C. & Harrar, Solomon W. & Madden, Laurence V., 2008. "How to compare small multivariate samples using nonparametric tests," Computational Statistics & Data Analysis, Elsevier, vol. 52(11), pages 4951-4965, July.
    4. Zhang, Huaiyu & Wang, Haiyan, 2021. "A more powerful test of equality of high-dimensional two-sample means," Computational Statistics & Data Analysis, Elsevier, vol. 164(C).
    5. Anil K. Ghosh & Munmun Biswas, 2016. "Distribution-free high-dimensional two-sample tests based on discriminating hyperplanes," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(3), pages 525-547, September.
    6. Feng, Long & Zhang, Xiaoxu & Liu, Binghui, 2020. "A high-dimensional spatial rank test for two-sample location problems," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    7. Ma, Yingying & Lan, Wei & Wang, Hansheng, 2015. "A high dimensional two-sample test under a low dimensional factor structure," Journal of Multivariate Analysis, Elsevier, vol. 140(C), pages 162-170.
    8. Lan Wang & Bo Peng & Runze Li, 2015. "A High-Dimensional Nonparametric Multivariate Test for Mean Vector," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1658-1669, December.
    9. Solomon Harrar & Arjun Gupta, 2007. "Asymptotic Expansion for the Null Distribution of the F-statistic in One-way ANOVA under Non-normality," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 59(3), pages 531-556, September.
    10. Zhang, Jie & Pan, Meng, 2016. "A high-dimension two-sample test for the mean using cluster subspaces," Computational Statistics & Data Analysis, Elsevier, vol. 97(C), pages 87-97.
    11. Solomon Harrar & Arne Bathke, 2012. "A modified two-factor multivariate analysis of variance: asymptotics and small sample approximations," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 64(1), pages 135-165, February.
    12. Liu, Chunxu & Bathke, Arne C. & Harrar, Solomon W., 2011. "A nonparametric version of Wilks' lambda--Asymptotic results and small sample approximations," Statistics & Probability Letters, Elsevier, vol. 81(10), pages 1502-1506, October.
    13. Srivastava, Muni S., 2009. "A test for the mean vector with fewer observations than the dimension under non-normality," Journal of Multivariate Analysis, Elsevier, vol. 100(3), pages 518-532, March.
    14. Qiu, Tao & Xu, Wangli & Zhu, Liping, 2021. "Two-sample test in high dimensions through random selection," Computational Statistics & Data Analysis, Elsevier, vol. 160(C).
    15. Zongliang Hu & Tiejun Tong & Marc G. Genton, 2019. "Diagonal likelihood ratio test for equality of mean vectors in high‐dimensional data," Biometrics, The International Biometric Society, vol. 75(1), pages 256-267, March.
    16. Chen, Songxi, 2012. "Two Sample Tests for High Dimensional Covariance Matrices," MPRA Paper 46026, University Library of Munich, Germany.
    17. Dennis Dobler & Sarah Friedrich & Markus Pauly, 2020. "Nonparametric MANOVA in meaningful effects," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 72(4), pages 997-1022, August.
    18. Edgar Brunner & Frank Konietschke & Markus Pauly & Madan L. Puri, 2017. "Rank-based procedures in factorial designs: hypotheses about non-parametric treatment effects," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1463-1485, November.
    19. Krishnamoorthy, K. & Yu, Jianqi, 2004. "Modified Nel and Van der Merwe test for the multivariate Behrens-Fisher problem," Statistics & Probability Letters, Elsevier, vol. 66(2), pages 161-169, January.
    20. Thompson, G. L., 1990. "Asymptotic distribution of rank statistics under dependencies with multivariate application," Journal of Multivariate Analysis, Elsevier, vol. 33(2), pages 183-211, May.
    21. Arjun Gupta & Solomon Harrar & Yasunori Fujikoshi, 2008. "MANOVA for large hypothesis degrees of freedom under non-normality," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 17(1), pages 120-137, May.
    22. Chen, Song Xi & Qin, Yingli, 2010. "A Two Sample Test for High Dimensional Data with Applications to Gene-set Testing," MPRA Paper 59642, University Library of Munich, Germany.
    23. T. Tony Cai & Weidong Liu & Yin Xia, 2014. "Two-sample test of high dimensional means under dependence," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(2), pages 349-372, March.
    24. Haiyan Wang & Michael Akritas, 2010. "Inference from heteroscedastic functional data," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 22(2), pages 149-168.
    25. Karl Bruce Gregory & Raymond J. Carroll & Veerabhadran Baladandayuthapani & Soumendra N. Lahiri, 2015. "A Two-Sample Test for Equality of Means in High Dimension," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(510), pages 837-849, June.
    26. Feng, Long & Sun, Fasheng, 2015. "A note on high-dimensional two-sample test," Statistics & Probability Letters, Elsevier, vol. 105(C), pages 29-36.
    27. Jiang Hu & Zhidong Bai & Chen Wang & Wei Wang, 2017. "On testing the equality of high dimensional mean vectors with unequal covariance matrices," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(2), pages 365-387, April.
    28. Harrar, Solomon W. & Bathke, Arne C., 2008. "Nonparametric methods for unbalanced multivariate data and many factor levels," Journal of Multivariate Analysis, Elsevier, vol. 99(8), pages 1635-1664, September.
    29. Wang, Haiyan & Akritas, Michael G., 2010. "Rank test for heteroscedastic functional data," Journal of Multivariate Analysis, Elsevier, vol. 101(8), pages 1791-1805, September.
    30. Wang, Rui & Xu, Xingzhong, 2018. "On two-sample mean tests under spiked covariances," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 225-249.
    31. Chao Zhang & Zhidong Bai & Jiang Hu & Chen Wang, 2018. "Multi-sample test for high-dimensional covariance matrices," Communications in Statistics - Theory and Methods, Taylor & Francis Journals, vol. 47(13), pages 3161-3177, July.
    32. Chen, Song Xi & Zhang, Li-Xin & Zhong, Ping-Shou, 2010. "Tests for High-Dimensional Covariance Matrices," Journal of the American Statistical Association, American Statistical Association, vol. 105(490), pages 810-819.
    33. Xiaoli Kong & Solomon W. Harrar, 2020. "High-dimensional rank-based inference," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 32(2), pages 294-322, April.
    34. Schott, James R., 2007. "Some high-dimensional tests for a one-way MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 98(9), pages 1825-1839, October.
    35. Yamada, Takayuki & Himeno, Tetsuto, 2015. "Testing homogeneity of mean vectors under heteroscedasticity in high-dimension," Journal of Multivariate Analysis, Elsevier, vol. 139(C), pages 7-27.
    36. Brunner, Edgar & Munzel, Ulrich & Puri, Madan L., 1999. "Rank-Score Tests in Factorial Designs with Repeated Measures," Journal of Multivariate Analysis, Elsevier, vol. 70(2), pages 286-317, August.
    37. Gupta, Arjun K. & Harrar, Solomon W. & Fujikoshi, Yasunori, 2006. "Asymptotics for testing hypothesis in some multivariate variance components model under non-normality," Journal of Multivariate Analysis, Elsevier, vol. 97(1), pages 148-178, January.
    38. Cai, T. Tony & Xia, Yin, 2014. "High-dimensional sparse MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 174-196.
    39. Srivastava, Muni S. & Kubokawa, Tatsuya, 2013. "Tests for multivariate analysis of variance in high dimension under non-normality," Journal of Multivariate Analysis, Elsevier, vol. 115(C), pages 204-216.
    40. M. Ahmad, 2014. "A U-statistic approach for a high-dimensional two-sample mean testing problem under non-normality and Behrens–Fisher setting," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 66(1), pages 33-61, February.
    41. Srivastava, Muni S. & Du, Meng, 2008. "A test for the mean vector with fewer observations than the dimension," Journal of Multivariate Analysis, Elsevier, vol. 99(3), pages 386-402, March.
    42. Hyodo, Masashi & Watanabe, Hiroki & Seo, Takashi, 2018. "On simultaneous confidence interval estimation for the difference of paired mean vectors in high-dimensional settings," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 160-173.
    43. Burchett, Woodrow W. & Ellis, Amanda R. & Harrar, Solomon W. & Bathke, Arne C., 2017. "Nonparametric Inference for Multivariate Data: The R Package npmv," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 76(i04).
    44. Srivastava, Muni S. & Katayama, Shota & Kano, Yutaka, 2013. "A two sample test in high dimensional data," Journal of Multivariate Analysis, Elsevier, vol. 114(C), pages 349-358.

    Citations



    Cited by:

    1. Miyazaki, Izuru, 2023. "Recovery of partly sparse and dense signals," Journal of Multivariate Analysis, Elsevier, vol. 195(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhang, Jin-Ting & Guo, Jia & Zhou, Bu, 2017. "Linear hypothesis testing in high-dimensional one-way MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 155(C), pages 200-216.
    2. Zhang, Jin-Ting & Zhou, Bu & Guo, Jia, 2022. "Linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA: A normal reference L2-norm based test," Journal of Multivariate Analysis, Elsevier, vol. 187(C).
    3. Pini, Alessia & Stamm, Aymeric & Vantini, Simone, 2018. "Hotelling’s T2 in separable Hilbert spaces," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 284-305.
    4. Jiang Hu & Zhidong Bai & Chen Wang & Wei Wang, 2017. "On testing the equality of high dimensional mean vectors with unequal covariance matrices," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(2), pages 365-387, April.
    5. Yuanyuan Jiang & Xingzhong Xu, 2022. "A Two-Sample Test of High Dimensional Means Based on Posterior Bayes Factor," Mathematics, MDPI, vol. 10(10), pages 1-23, May.
    6. Davy Paindaveine & Thomas Verdebout, 2013. "Universal Asymptotics for High-Dimensional Sign Tests," Working Papers ECARES ECARES 2013-40, ULB -- Universite Libre de Bruxelles.
    7. Huang, Yuan & Li, Changcheng & Li, Runze & Yang, Songshan, 2022. "An overview of tests on high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    8. Huiqin Li & Jiang Hu & Zhidong Bai & Yanqing Yin & Kexin Zou, 2017. "Test on the linear combinations of mean vectors in high-dimensional data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 26(1), pages 188-208, March.
    9. Zhang, Jin-Ting & Zhu, Tianming, 2022. "A new normal reference test for linear hypothesis testing in high-dimensional heteroscedastic one-way MANOVA," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    10. Jin-Ting Zhang & Bu Zhou & Jia Guo, 2022. "Testing high-dimensional mean vector with applications," Statistical Papers, Springer, vol. 63(4), pages 1105-1137, August.
    11. Tzviel Frostig & Yoav Benjamini, 2022. "Testing the equality of multivariate means when p > n by combining the Hotelling and Simes tests," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(2), pages 390-415, June.
    12. Friedrich, Sarah & Pauly, Markus, 2018. "MATS: Inference for potentially singular and heteroscedastic MANOVA," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 166-179.
    13. Ley, Christophe & Paindaveine, Davy & Verdebout, Thomas, 2015. "High-dimensional tests for spherical location and spiked covariance," Journal of Multivariate Analysis, Elsevier, vol. 139(C), pages 79-91.
    14. Feng, Long & Sun, Fasheng, 2015. "A note on high-dimensional two-sample test," Statistics & Probability Letters, Elsevier, vol. 105(C), pages 29-36.
    15. Yin, Yanqing, 2021. "Test for high-dimensional mean vector under missing observations," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    16. Feng, Long & Zhang, Xiaoxu & Liu, Binghui, 2020. "A high-dimensional spatial rank test for two-sample location problems," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    17. Li, Jun, 2023. "Finite sample t-tests for high-dimensional means," Journal of Multivariate Analysis, Elsevier, vol. 196(C).
    18. Muni S. Srivastava & Hirokazu Yanagihara & Tatsuya Kubokawa, 2014. "Tests for Covariance Matrices in High Dimension with Less Sample Size," CIRJE F-Series CIRJE-F-933, CIRJE, Faculty of Economics, University of Tokyo.
    19. M. Rauf Ahmad, 2019. "A unified approach to testing mean vectors with large dimensions," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 103(4), pages 593-618, December.
    20. Hyodo, Masashi & Watanabe, Hiroki & Seo, Takashi, 2018. "On simultaneous confidence interval estimation for the difference of paired mean vectors in high-dimensional settings," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 160-173.
