IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v206y2025ics016794732400207x.html

An efficient and distribution-free symmetry test for high-dimensional data based on energy statistics and random projections

Author

Listed:
  • Chen, Bo
  • Chen, Feifei
  • Wang, Junxin
  • Qiu, Tao

Abstract

Testing for departures from symmetry is a critical issue in statistics. Over the last two decades, substantial effort has been invested in developing tests for central symmetry in multivariate and high-dimensional contexts. Traditional tests, which rely on Euclidean distance, face significant challenges in high-dimensional data. These tests struggle to capture overall central symmetry and are often limited to verifying whether the distribution's center aligns with the coordinate origin, a problem exacerbated by the “curse of dimensionality.” Furthermore, they tend to be computationally intensive, often making them impractical for large datasets. To overcome these limitations, we propose a nonparametric test based on the random projected energy distance, extending the energy distance test through random projections. This method effectively reduces data dimensions by projecting high-dimensional data onto lower-dimensional spaces, with the randomness ensuring maximum preservation of information. Theoretically, as the number of random projections approaches infinity, the risk of power loss from inadequate directions is mitigated. Leveraging U-statistic theory, we show that the asymptotic null distribution of our test statistic is standard normal regardless of the data dimensionality relative to sample size, thus eliminating the need for re-sampling to determine critical values. For computational efficiency with large datasets, we adopt a divide-and-average strategy, partitioning the data into smaller blocks of size m. Within each block, the estimates of the random projected energy distance are normally distributed. By averaging these estimates across all blocks, we derive a test statistic that is asymptotically standard normal. This method reduces computational complexity from quadratic to linear in sample size, enhancing the feasibility of our test for extensive data analysis. Through extensive numerical studies, we have demonstrated the robust empirical performance of our test in terms of size and power, affirming its practical utility in statistical applications for high-dimensional data.
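
The divide-and-average idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; the abstract does not give the exact statistic, so several choices here are assumptions. The kernel |u'(x + x')| - |u'(x - x')| is the standard one-dimensional energy-type measure of symmetry about the origin, applied to data projected onto random unit directions u. The function names (projected_symmetry_u, block_symmetry_test), the default block size and number of projections, and the standardization by the sample standard deviation of the block estimates are all hypothetical choices made for illustration.

```python
import math
import numpy as np


def projected_symmetry_u(x, directions):
    """U-statistic estimate of E|u'(X+X')| - E|u'(X-X')|, averaged over directions.

    x: (n, p) data block; directions: (K, p) unit vectors (assumed, not from the paper).
    The population quantity is nonnegative and equals zero when X and -X
    have the same distribution along every projected direction.
    """
    n = x.shape[0]
    z = x @ directions.T                                # (n, K) projected data
    plus = np.abs(z[:, None, :] + z[None, :, :])        # |u'(x_i + x_j)|
    minus = np.abs(z[:, None, :] - z[None, :, :])       # |u'(x_i - x_j)|
    h = (plus - minus).mean(axis=2)                     # average over directions
    i, j = np.triu_indices(n, k=1)                      # distinct pairs i < j
    return h[i, j].mean()


def block_symmetry_test(x, block_size=20, n_projections=50, seed=0):
    """Divide-and-average test of symmetry about the origin (illustrative sketch).

    Splits the sample into disjoint blocks, computes one estimate per block,
    and standardizes the average of the block estimates. Returns the
    standardized statistic and a one-sided normal p-value.
    """
    rng = np.random.default_rng(seed)
    n, p = x.shape
    n_blocks = n // block_size
    if n_blocks < 2:
        raise ValueError("need at least two full blocks")

    # Random directions drawn uniformly on the unit sphere.
    directions = rng.standard_normal((n_projections, p))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    # One estimate per block: cost is O(block_size^2) per block,
    # hence linear in n for a fixed block size.
    estimates = np.array([
        projected_symmetry_u(x[b * block_size:(b + 1) * block_size], directions)
        for b in range(n_blocks)
    ])

    # Assumed standardization: sample standard deviation across blocks.
    se = estimates.std(ddof=1) / math.sqrt(n_blocks)
    z_stat = estimates.mean() / se
    p_value = 0.5 * math.erfc(z_stat / math.sqrt(2))    # upper-tail P(Z > z_stat)
    return z_stat, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sym = rng.standard_normal((1000, 20))               # symmetric about the origin
    print(block_symmetry_test(sym))
    print(block_symmetry_test(sym + 1.0))               # center shifted away from origin
```

With a fixed block size m, each block costs on the order of m² kernel evaluations and there are about n/m blocks, so the total cost grows linearly in n, which is the complexity reduction the abstract refers to. The critical value comes from the standard normal distribution, so no re-sampling is involved in this sketch.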

Suggested Citation

  • Chen, Bo & Chen, Feifei & Wang, Junxin & Qiu, Tao, 2025. "An efficient and distribution-free symmetry test for high-dimensional data based on energy statistics and random projections," Computational Statistics & Data Analysis, Elsevier, vol. 206(C).
  • Handle: RePEc:eee:csdana:v:206:y:2025:i:c:s016794732400207x
    DOI: 10.1016/j.csda.2024.108123

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S016794732400207X
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2024.108123?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Zhong, Ping-Shou & Chen, Song Xi, 2011. "Tests for High-Dimensional Regression Coefficients With Factorial Designs," Journal of the American Statistical Association, American Statistical Association, vol. 106(493), pages 260-274.
    2. Chen, Feifei & Meintanis, Simos G. & Zhu, Lixing, 2019. "On some characterizations and multidimensional criteria for testing homogeneity, symmetry and independence," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 125-144.
    3. Henderson, Daniel J. & Parmeter, Christopher F., 2015. "A consistent bootstrap procedure for nonparametric symmetry tests," Economics Letters, Elsevier, vol. 131(C), pages 78-82.
    4. Lyubchich, Vyacheslav & Wang, Xingyu & Heyes, Andrew & Gel, Yulia R., 2016. "A distribution-free m-out-of-n bootstrap approach to testing symmetry about an unknown median," Computational Statistics & Data Analysis, Elsevier, vol. 104(C), pages 1-9.
    5. Jian Yan & Xianyang Zhang, 2023. "Kernel two-sample tests in high dimensions: interplay between moment discrepancy and dimension-and-sample orders," Biometrika, Biometrika Trust, vol. 110(2), pages 411-430.
    6. Baringhaus, L. & Franz, C., 2004. "On a new multivariate two-sample test," Journal of Multivariate Analysis, Elsevier, vol. 88(1), pages 190-206, January.
    7. Koziol, James A., 1985. "A note on testing symmetry with estimated parameters," Statistics & Probability Letters, Elsevier, vol. 3(4), pages 227-230, July.
    8. Qiu, Tao & Xu, Wangli & Zhu, Liping, 2021. "Two-sample test in high dimensions through random selection," Computational Statistics & Data Analysis, Elsevier, vol. 160(C).
    9. Henze, N. & Klar, B. & Meintanis, S. G., 2003. "Invariant tests for symmetry about an unspecified point based on the empirical characteristic function," Journal of Multivariate Analysis, Elsevier, vol. 87(2), pages 275-297, November.
    10. Chen, Song Xi & Qin, Yingli, 2010. "A Two Sample Test for High Dimensional Data with Applications to Gene-set Testing," MPRA Paper 59642, University Library of Munich, Germany.
    11. Sang, Yongli, 2024. "Test for diagonal symmetry in high dimension," Statistics & Probability Letters, Elsevier, vol. 205(C).
    12. Qiu, Tao & Xu, Wangli & Zhu, Lixing, 2023. "Independence tests with random subspace of two random vectors in high dimension," Journal of Multivariate Analysis, Elsevier, vol. 195(C).
    13. Dai, Xinjie & Niu, Cuizhen & Guo, Xu, 2018. "Testing for central symmetry and inference of the unknown center," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 15-31.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sang, Yongli, 2024. "Test for diagonal symmetry in high dimension," Statistics & Probability Letters, Elsevier, vol. 205(C).
    2. Qiu, Tao & Zhang, Qintong & Fang, Yuanyuan & Xu, Wangli, 2024. "Testing homogeneity in high dimensional data through random projections," Journal of Multivariate Analysis, Elsevier, vol. 200(C).
    3. Dai, Xinjie & Niu, Cuizhen & Guo, Xu, 2018. "Testing for central symmetry and inference of the unknown center," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 15-31.
    4. Niu, Cuizhen & Guo, Xu & Li, Yong & Zhu, Lixing, 2018. "Pairwise distance-based tests for conditional symmetry," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 145-162.
    5. Yang, Weichao & Guo, Xu & Zhu, Lixing, 2024. "Tests for high-dimensional generalized linear models under general covariance structure," Computational Statistics & Data Analysis, Elsevier, vol. 199(C).
    6. Biswas, Munmun & Ghosh, Anil K., 2014. "A nonparametric two-sample test applicable to high dimensional data," Journal of Multivariate Analysis, Elsevier, vol. 123(C), pages 160-171.
    7. Bin Guo & Song Xi Chen, 2016. "Tests for high dimensional generalized linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(5), pages 1079-1102, November.
    8. Hongwei Shi & Xinyu Zhang & Xu Guo & Baihua He & Chenyang Wang, 2025. "Testing overidentifying restrictions on high-dimensional instruments and covariates," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 77(2), pages 331-352, April.
    9. Zacharias Psaradakis & Marian Vavra, 2018. "Bootstrap Assisted Tests of Symmetry for Dependent Data," Working and Discussion Papers WP 5/2018, Research Department, National Bank of Slovakia.
    10. Delgado, Miguel A. & Song, Xiaojun, 2018. "Nonparametric tests for conditional symmetry," Journal of Econometrics, Elsevier, vol. 206(2), pages 447-471.
    11. Mondal, Pronoy K. & Biswas, Munmun & Ghosh, Anil K., 2015. "On high dimensional two-sample tests based on nearest neighbors," Journal of Multivariate Analysis, Elsevier, vol. 141(C), pages 168-178.
    12. Ma, Yingying & Lan, Wei & Wang, Hansheng, 2015. "Testing predictor significance with ultra high dimensional multivariate responses," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 275-286.
    13. Tarik Bahraoui & Jean‐François Quessy, 2022. "Tests of multivariate copula exchangeability based on Lévy measures," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(3), pages 1215-1243, September.
    14. Yata, Kazuyoshi & Aoshima, Makoto, 2016. "High-dimensional inference on covariance structures via the extended cross-data-matrix methodology," Journal of Multivariate Analysis, Elsevier, vol. 151(C), pages 151-166.
    15. Wang, Siyang & Cui, Hengjian, 2013. "Generalized F test for high dimensional linear regression coefficients," Journal of Multivariate Analysis, Elsevier, vol. 117(C), pages 134-149.
    16. Zang, Yangguang & Zhang, Sanguo & Li, Qizhai & Zhang, Qingzhao, 2016. "Jackknife empirical likelihood test for high-dimensional regression coefficients," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 302-316.
    17. Xu, Kai & Tian, Yan & He, Daojiang, 2021. "A high dimensional nonparametric test for proportional covariance matrices," Journal of Multivariate Analysis, Elsevier, vol. 184(C).
    18. Harrar, Solomon W. & Kong, Xiaoli, 2022. "Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    19. Xu, Kai & Hao, Xinxin, 2019. "A nonparametric test for block-diagonal covariance structure in high dimension and small samples," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 551-567.
    20. James S. Allison & Charl Pretorius, 2017. "A Monte Carlo evaluation of the performance of two new tests for symmetry," Computational Statistics, Springer, vol. 32(4), pages 1323-1338, December.
