
The exact equivalence of distance and kernel methods in hypothesis testing

Author

Listed:
  • Cencheng Shen

    (University of Delaware)

  • Joshua T. Vogelstein

    (Johns Hopkins University)

Abstract

Distance correlation and Hilbert-Schmidt independence criterion are widely used for independence testing, two-sample testing, and many inference tasks in statistics and machine learning. These two methods are tightly related, yet are treated as two different entities in the majority of existing literature. In this paper, we propose a simple and elegant bijection between metric and kernel. The bijective transformation better preserves the similarity structure, allows distance correlation and Hilbert-Schmidt independence criterion to be always the same for hypothesis testing, streamlines the code base for implementation, and enables a rich literature of distance-based and kernel-based methodologies to directly communicate with each other.
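The abstract does not state the bijection itself, so the following is only a minimal sketch under an assumption: a "maximum minus entry" transform that turns a distance matrix into a kernel matrix (and back again when the smallest entry is zero). The function names, the simulated data, and the choice of the biased double-centered sample statistic are illustrative and not taken from the paper. Under this assumed transform, double centering removes the added constant and flips the sign of every entry, so the distance-based and kernel-based statistics agree up to floating-point error.

```python
import math
import random


def distance_matrix(points):
    """Pairwise Euclidean distances for a list of equal-length tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def max_minus(matrix):
    """Assumed bijection (hypothetical helper): subtract each entry from the matrix maximum."""
    top = max(max(row) for row in matrix)
    return [[top - value for value in row] for row in matrix]


def double_center(matrix):
    """Subtract row and column means and add back the grand mean."""
    n = len(matrix)
    row_means = [sum(row) / n for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_means) / n
    return [
        [matrix[i][j] - row_means[i] - col_means[j] + grand for j in range(n)]
        for i in range(n)
    ]


def centered_statistic(a, b):
    """Biased sample statistic: mean entrywise product of the double-centered matrices."""
    n = len(a)
    ca, cb = double_center(a), double_center(b)
    return sum(ca[i][j] * cb[i][j] for i in range(n) for j in range(n)) / n**2


# Simulated, nonlinearly dependent sample (illustrative data only).
random.seed(0)
x = [(random.gauss(0, 1),) for _ in range(30)]
y = [(xi[0] ** 2 + 0.1 * random.gauss(0, 1),) for xi in x]

dx, dy = distance_matrix(x), distance_matrix(y)
kx, ky = max_minus(dx), max_minus(dy)  # kernels induced from the distances

dcov_stat = centered_statistic(dx, dy)  # distance-covariance-style statistic
hsic_stat = centered_statistic(kx, ky)  # HSIC-style statistic on induced kernels

print(math.isclose(dcov_stat, hsic_stat, rel_tol=1e-9))
```

Because the two statistics coincide for every permutation of the sample as well, a permutation test built on either one would return the same p-value under this assumed transform.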

Suggested Citation

  • Cencheng Shen & Joshua T. Vogelstein, 2021. "The exact equivalence of distance and kernel methods in hypothesis testing," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 105(3), pages 385-403, September.
  • Handle: RePEc:spr:alstar:v:105:y:2021:i:3:d:10.1007_s10182-020-00378-1
    DOI: 10.1007/s10182-020-00378-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10182-020-00378-1
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Cencheng Shen & Carey E. Priebe & Joshua T. Vogelstein, 2020. "From Distance Correlation to Multiscale Graph Correlation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 115(529), pages 280-291, January.
    2. Zhou Zhou, 2012. "Measuring nonlinear dependence in time‐series, a distance correlation approach," Journal of Time Series Analysis, Wiley Blackwell, vol. 33(3), pages 438-457, May.
    3. Youjin Lee & Cencheng Shen & Carey E Priebe & Joshua T Vogelstein, 2019. "Network dependence testing via diffusion maps and distance-based correlations," Biometrika, Biometrika Trust, vol. 106(4), pages 857-873.
    4. Liping Zhu & Kai Xu & Runze Li & Wei Zhong, 2017. "Projection correlation between two random vectors," Biometrika, Biometrika Trust, vol. 104(4), pages 829-843.
    5. Runze Li & Wei Zhong & Liping Zhu, 2012. "Feature Screening via Distance Correlation Learning," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(499), pages 1129-1139, September.
    6. Xueqin Wang & Wenliang Pan & Wenhao Hu & Yuan Tian & Heping Zhang, 2015. "Conditional Distance Correlation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1726-1734, December.
    7. K Fokianos & M Pitsillou, 2018. "Testing independence for multivariate time series via the auto-distance correlation matrix," Biometrika, Biometrika Trust, vol. 105(2), pages 337-352.
    8. Ruth Heller & Yair Heller & Malka Gorfine, 2013. "A consistent multivariate test of association based on ranks of distances," Biometrika, Biometrika Trust, vol. 100(2), pages 503-510.
    9. Gabor J. Szekely & Maria L. Rizzo, 2005. "Hierarchical Clustering via Joint Between-Within Distances: Extending Ward's Minimum Variance Method," Journal of Classification, Springer;The Classification Society, vol. 22(2), pages 151-183, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhou, Yeqing & Liu, Jingyuan & Zhu, Liping, 2020. "Test for conditional independence with application to conditional screening," Journal of Multivariate Analysis, Elsevier, vol. 175(C).
    2. Chu, Ba, 2023. "A distance-based test of independence between two multivariate time series," Journal of Multivariate Analysis, Elsevier, vol. 195(C).
    3. Fan, Jinlin & Zhang, Yaowu & Zhu, Liping, 2022. "Independence tests in the presence of measurement errors: An invariance law," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    4. Ke, Chenlu & Yang, Wei & Yuan, Qingcong & Li, Lu, 2023. "Partial sufficient variable screening with categorical controls," Computational Statistics & Data Analysis, Elsevier, vol. 187(C).
    5. Hongjian Shi & Marc Hallin & Mathias Drton & Fang Han, 2020. "Rate-Optimality of Consistent Distribution-Free Tests of Independence Based on Center-Outward Ranks and Signs," Working Papers ECARES 2020-23, ULB -- Universite Libre de Bruxelles.
    6. Simos G. Meintanis & Joseph Ngatchou-Wandji & James Allison, 2018. "Testing for serial independence in vector autoregressive models," Statistical Papers, Springer, vol. 59(4), pages 1379-1410, December.
    7. Xin Dang & Dao Nguyen & Yixin Chen & Junying Zhang, 2021. "A new Gini correlation between quantitative and qualitative variables," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(4), pages 1314-1343, December.
    8. Matsui, Muneya & Mikosch, Thomas & Roozegar, Rasool & Tafakori, Laleh, 2022. "Distance covariance for random fields," Stochastic Processes and their Applications, Elsevier, vol. 150(C), pages 280-322.
    9. Yi Liu & Qihua Wang, 2018. "Model-free feature screening for ultrahigh-dimensional data conditional on some variables," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(2), pages 283-301, April.
    10. Lai, Tingyu & Zhang, Zhongzhan & Wang, Yafei & Kong, Linglong, 2021. "Testing independence of functional variables by angle covariance," Journal of Multivariate Analysis, Elsevier, vol. 182(C).
    11. Zhang, Wei & Gao, Wei & Ng, Hon Keung Tony, 2023. "Multivariate tests of independence based on a new class of measures of independence in Reproducing Kernel Hilbert Space," Journal of Multivariate Analysis, Elsevier, vol. 195(C).
    12. Dueck, Johannes & Edelmann, Dominic & Richards, Donald, 2015. "A generalization of an integral arising in the theory of distance correlation," Statistics & Probability Letters, Elsevier, vol. 97(C), pages 116-119.
    13. Chaudhuri, Arin & Hu, Wenhao, 2019. "A fast algorithm for computing distance correlation," Computational Statistics & Data Analysis, Elsevier, vol. 135(C), pages 15-24.
    14. L Weihs & M Drton & N Meinshausen, 2018. "Symmetric rank covariances: a generalized framework for nonparametric measures of dependence," Biometrika, Biometrika Trust, vol. 105(3), pages 547-562.
    15. Jun Lu & Lu Lin, 2020. "Model-free conditional screening via conditional distance correlation," Statistical Papers, Springer, vol. 61(1), pages 225-244, February.
    16. Yuan, Qingcong & Chen, Xianyan & Ke, Chenlu & Yin, Xiangrong, 2022. "Independence index sufficient variable screening for categorical responses," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    17. Liu, Jicai & Si, Yuefeng & Niu, Yong & Zhang, Riquan, 2022. "Projection quantile correlation and its use in high-dimensional grouped variable screening," Computational Statistics & Data Analysis, Elsevier, vol. 167(C).
    18. Guochang Wang & Wai Keung Li & Ke Zhu, 2018. "New HSIC-based tests for independence between two stationary multivariate time series," Papers 1804.09866, arXiv.org.
    19. Jozef Baruník & Tobias Kley, 2019. "Quantile coherency: A general measure for dependence between cyclical economic variables," The Econometrics Journal, Royal Economic Society, vol. 22(2), pages 131-152.
    20. Jia Zhu & Xingcheng Wu & Xueqin Lin & Changqin Huang & Gabriel Pui Cheong Fung & Yong Tang, 2018. "A novel multiple layers name disambiguation framework for digital libraries using dynamic clustering," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 781-794, March.
