IDEAS home Printed from https://ideas.repec.org/p/ehl/lserod/125584.html
   My bibliography  Save this paper

A simple measure of conditional dependence

Author

Listed:
  • Azadkia, Mona
  • Chatterjee, Sourav

Abstract

We propose a coefficient of conditional dependence between two random variables Y and Z given a set of other variables X1, . . . , Xp, based on an i.i.d. sample. The coefficient has a long list of desirable properties, the most important of which is that under absolutely no distributional assumptions, it converges to a limit in [0, 1], where the limit is 0 if and only if Y and Z are conditionally independent given X1, . . . , Xp, and is 1 if and only if Y is equal to a measurable function of Z given X1, . . . , Xp. Moreover, it has a natural interpretation as a nonlinear generalization of the familiar partial R2 statistic for measuring conditional dependence by regression. Using this statistic, we devise a new variable selection algorithm, called Feature Ordering by Conditional Independence (FOCI), which is model-free, has no tuning parameters, and is provably consistent under sparsity assumptions. A number of applications to synthetic and real datasets are worked out.

Suggested Citation

  • Azadkia, Mona & Chatterjee, Sourav, 2021. "A simple measure of conditional dependence," LSE Research Online Documents on Economics 125584, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:125584
    as

    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/125584/
    File Function: Open access version.
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    2. Su, Liangjun & White, Halbert, 2007. "A consistent characteristic function-based test for conditional independence," Journal of Econometrics, Elsevier, vol. 141(2), pages 807-834, December.
    3. Su, Liangjun & White, Halbert, 2008. "A Nonparametric Hellinger Metric Test For Conditional Independence," Econometric Theory, Cambridge University Press, vol. 24(4), pages 829-864, August.
    4. Holger Dette & Karl F. Siburg & Pavel A. Stoimenov, 2013. "A Copula-Based Non-parametric Measure of Regression Dependence," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 40(1), pages 21-41, March.
    5. Su, Liangjun & White, Halbert, 2014. "Testing conditional independence via empirical likelihood," Journal of Econometrics, Elsevier, vol. 182(1), pages 27-44.
    6. Emmanuel Candès & Yingying Fan & Lucas Janson & Jinchi Lv, 2018. "Panning for gold: ‘model‐X’ knockoffs for high dimensional controlled variable selection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(3), pages 551-577, June.
    7. Pradeep Ravikumar & John Lafferty & Han Liu & Larry Wasserman, 2009. "Sparse additive models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(5), pages 1009-1030, November.
    8. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    9. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    10. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    11. Ming Yuan & Yi Lin, 2006. "Model selection and estimation in regression with grouped variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 49-67, February.
    12. Noël Veraverbeke & Marek Omelka & Irène Gijbels, 2011. "Estimation of a Conditional Copula and Association Measures," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 38(4), pages 766-780, December.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Keyao Wang & Huiwen Wang & Shanshan Wang & Lihong Wang, 2024. "Variable selection for multivariate functional data via conditional correlation learning," Computational Statistics, Springer, vol. 39(4), pages 2375-2412, June.
    2. Ansari Jonathan & Rockel Marcus, 2024. "Dependence properties of bivariate copula families," Dependence Modeling, De Gruyter, vol. 12(1), pages 1-36.
    3. De Keyser, Steven & Gijbels, Irène, 2025. "High-dimensional copula-based Wasserstein dependence," Computational Statistics & Data Analysis, Elsevier, vol. 204(C).
    4. Yu Yang & Sthitie Bom & Xiaotong Shen, 2024. "A hierarchical ensemble causal structure learning approach for wafer manufacturing," Journal of Intelligent Manufacturing, Springer, vol. 35(6), pages 2961-2978, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. McKay Curtis, S. & Banerjee, Sayantan & Ghosal, Subhashis, 2014. "Fast Bayesian model assessment for nonparametric additive regression," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 347-358.
    2. Jiang, Liewen & Bondell, Howard D. & Wang, Huixia Judy, 2014. "Interquantile shrinkage and variable selection in quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 69(C), pages 208-219.
    3. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    4. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    5. Capanu, Marinela & Giurcanu, Mihai & Begg, Colin B. & Gönen, Mithat, 2023. "Subsampling based variable selection for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 184(C).
    6. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    7. Zhang, Tonglin, 2024. "Variables selection using L0 penalty," Computational Statistics & Data Analysis, Elsevier, vol. 190(C).
    8. Takumi Saegusa & Tianzhou Ma & Gang Li & Ying Qing Chen & Mei-Ling Ting Lee, 2020. "Variable Selection in Threshold Regression Model with Applications to HIV Drug Adherence Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(3), pages 376-398, December.
    9. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    10. Qingliang Fan & Yaqian Wu, 2020. "Endogenous Treatment Effect Estimation with some Invalid and Irrelevant Instruments," Papers 2006.14998, arXiv.org.
    11. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    12. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    13. Yanfang Zhang & Chuanhua Wei & Xiaolin Liu, 2022. "Group Logistic Regression Models with l p,q Regularization," Mathematics, MDPI, vol. 10(13), pages 1-15, June.
    14. Liming Wang & Xingxiang Li & Xiaoqing Wang & Peng Lai, 2022. "Unified mean-variance feature screening for ultrahigh-dimensional regression," Computational Statistics, Springer, vol. 37(4), pages 1887-1918, September.
    15. Zhang, Yan-Qing & Tian, Guo-Liang & Tang, Nian-Sheng, 2016. "Latent variable selection in structural equation models," Journal of Multivariate Analysis, Elsevier, vol. 152(C), pages 190-205.
    16. Tanin Sirimongkolkasem & Reza Drikvandi, 2019. "On Regularisation Methods for Analysis of High Dimensional Data," Annals of Data Science, Springer, vol. 6(4), pages 737-763, December.
    17. Yanming Li & Bin Nan & Ji Zhu, 2015. "Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure," Biometrics, The International Biometric Society, vol. 71(2), pages 354-363, June.
    18. Fang, Xiaolei & Paynabar, Kamran & Gebraeel, Nagi, 2017. "Multistream sensor fusion-based prognostics model for systems with single failure modes," Reliability Engineering and System Safety, Elsevier, vol. 159(C), pages 322-331.
    19. Mogliani, Matteo & Simoni, Anna, 2021. "Bayesian MIDAS penalized regressions: Estimation, selection, and prediction," Journal of Econometrics, Elsevier, vol. 222(1), pages 833-860.
    20. Patric Müller & Sara Geer, 2016. "Censored linear model in high dimensions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 25(1), pages 75-92, March.

    More about this item

    Keywords

    conditional dependence; non-parametric measures of association; variable selection;
    All these keywords.

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:ehl:lserod:125584. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: LSERO Manager (email available below). General contact details of provider: https://edirc.repec.org/data/lsepsuk.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.