
Analysis of conditional randomisation and permutation schemes with application to conditional independence testing

Authors
  • Małgorzata Łazęcka

    (Warsaw University of Technology
    Polish Academy of Sciences
    University of Warsaw)

  • Bartosz Kołodziejek

    (Warsaw University of Technology)

  • Jan Mielniczuk

    (Warsaw University of Technology
    Polish Academy of Sciences)

Abstract

We study properties of two resampling scenarios: Conditional Randomisation and Conditional Permutation schemes, which are relevant for testing conditional independence of discrete random variables X and Y given a random variable Z. Namely, we investigate asymptotic behaviour of estimates of a vector of probabilities in such settings, establish their asymptotic normality and ordering between asymptotic covariance matrices. The results are used to derive asymptotic distributions of the empirical Conditional Mutual Information in those set-ups. Somewhat unexpectedly, the distributions coincide for the two scenarios, despite differences in the asymptotic distributions of the estimates of probabilities. We also prove validity of permutation p-values for the Conditional Permutation scheme. The above results justify consideration of conditional independence tests based on resampled p-values and on the asymptotic chi-square distribution with an adjusted number of degrees of freedom. We show in numerical experiments that when the ratio of the sample size to the number of possible values of the triple exceeds 0.5, the test based on the asymptotic distribution with the adjustment made on a limited number of permutations is a viable alternative to the exact test for both the Conditional Permutation and the Conditional Randomisation scenarios. Moreover, there is no significant difference between the performance of exact tests for Conditional Permutation and Randomisation schemes, the latter requiring knowledge of conditional distribution of X given Z, and the same conclusion is true for both adaptive tests.
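The Conditional Permutation scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`empirical_cmi`, `permute_within_strata`, `cp_pvalue`), the default of 200 permutations, and the `(1 + count) / (1 + B)` form of the p-value are assumptions made for illustration. The sketch computes the plug-in Conditional Mutual Information of discrete X and Y given Z, then compares it with values obtained after shuffling X separately within each level of Z.

```python
import math
import random
from collections import Counter


def empirical_cmi(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in nats from three equal-length sequences."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    # Sum over observed cells of p(x,y,z) * log( p(x,y|z) / (p(x|z) p(y|z)) ).
    return sum(
        c / n * math.log(c * n_z[k] / (n_xz[i, k] * n_yz[j, k]))
        for (i, j, k), c in n_xyz.items()
    )


def permute_within_strata(x, z, rng):
    """One Conditional Permutation draw: shuffle x inside each level of z."""
    strata = {}
    for idx, level in enumerate(z):
        strata.setdefault(level, []).append(idx)
    x_perm = list(x)
    for indices in strata.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        for target, source in zip(indices, shuffled):
            x_perm[target] = x[source]
    return x_perm


def cp_pvalue(x, y, z, n_perm=200, seed=0):
    """Return (observed CMI, permutation p-value); n_perm and seed are illustrative."""
    rng = random.Random(seed)
    observed = empirical_cmi(x, y, z)
    exceed = sum(
        empirical_cmi(permute_within_strata(x, z, rng), y, z) >= observed
        for _ in range(n_perm)
    )
    # Assumed convention: the observed sample counts as one of the permutations.
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    z = [0] * 50 + [1] * 50
    x = [0, 1] * 50
    y = list(x)  # Y copies X, so X and Y are strongly dependent given Z
    print(cp_pvalue(x, y, z))
```

The Conditional Randomisation scheme would instead draw new values of X from the conditional distribution of X given Z, which must then be known. The chi-square test with an adjusted number of degrees of freedom mentioned in the abstract is not covered by this sketch.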

Suggested Citation

  • Małgorzata Łazęcka & Bartosz Kołodziejek & Jan Mielniczuk, 2023. "Analysis of conditional randomisation and permutation schemes with application to conditional independence testing," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(4), pages 1459-1478, December.
  • Handle: RePEc:spr:testjl:v:32:y:2023:i:4:d:10.1007_s11749-023-00878-7
    DOI: 10.1007/s11749-023-00878-7

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11749-023-00878-7
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Thomas B. Berrett & Yi Wang & Rina Foygel Barber & Richard J. Samworth, 2020. "The conditional permutation test for independence while controlling for confounders," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(1), pages 175-197, February.
    2. Emmanuel Candès & Yingying Fan & Lucas Janson & Jinchi Lv, 2018. "Panning for gold: ‘model‐X’ knockoffs for high dimensional controlled variable selection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(3), pages 551-577, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pedro Delicado & Daniel Peña, 2023. "Understanding complex predictive models with ghost variables," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(1), pages 107-145, March.
    2. Dong, Yan & Li, Daoji & Zheng, Zemin & Zhou, Jia, 2022. "Reproducible feature selection in high-dimensional accelerated failure time models," Statistics & Probability Letters, Elsevier, vol. 181(C).
    3. Emmanuel Candès & Chiara Sabatti, 2020. "Discussion of the Paper “Prediction, Estimation, and Attribution” by B. Efron," International Statistical Review, International Statistical Institute, vol. 88(S1), pages 60-63, December.
    4. Baihua He & Di Xia & Yingli Pan, 2024. "High dimensional controlled variable selection with model-X knockoffs in the AFT model," Computational Statistics, Springer, vol. 39(4), pages 1993-2009, June.
    5. Zihuai He & Linxi Liu & Michael E. Belloy & Yann Guen & Aaron Sossin & Xiaoxia Liu & Xinran Qi & Shiyang Ma & Prashnna K. Gyawali & Tony Wyss-Coray & Hua Tang & Chiara Sabatti & Emmanuel Candès & Mich, 2022. "GhostKnockoff inference empowers identification of putative causal variants in genome-wide association studies," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    6. Rajchert, Andrew & Keich, Uri, 2023. "Controlling the false discovery rate via competition: Is the +1 needed?," Statistics & Probability Letters, Elsevier, vol. 197(C).
    7. Yumei Ren & Guoqiang Tang & Xin Li & Xuchang Chen, 2023. "A Study of Multifactor Quantitative Stock-Selection Strategies Incorporating Knockoff and Elastic Net-Logistic Regression," Mathematics, MDPI, vol. 11(16), pages 1-20, August.
    8. Yi Liu & Veronika Ročková & Yuexi Wang, 2021. "Variable selection with ABC Bayesian forests," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(3), pages 453-481, July.
    9. Laura Freijeiro‐González & Manuel Febrero‐Bande & Wenceslao González‐Manteiga, 2022. "A Critical Review of LASSO and Its Derivatives for Variable Selection Under Dependence Among Covariates," International Statistical Review, International Statistical Institute, vol. 90(1), pages 118-145, April.
    10. Xie, Zilong & Chen, Yunxiao & von Davier, Matthias & Weng, Haolei, 2023. "Variable selection in latent variable models via knockoffs: an application to international large-scale assessment in education," LSE Research Online Documents on Economics 120812, London School of Economics and Political Science, LSE Library.
    11. Dae Woong Ham & Jiaze Qiu, 2023. "Hypothesis testing in adaptively sampled data: ART to maximize power beyond iid sampling," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(3), pages 998-1037, September.
    12. Jeng, X. Jessie & Chen, Xiongzhi, 2019. "Predictor ranking and false discovery proportion control in high-dimensional regression," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 163-175.
    13. Challet, Damien & Bongiorno, Christian & Pelletier, Guillaume, 2021. "Financial factors selection with knockoffs: Fund replication, explanatory and prediction networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 580(C).
    14. L Bottolo & S Richardson, 2019. "Discussion of ‘Gene hunting with hidden Markov model knockoffs’," Biometrika, Biometrika Trust, vol. 106(1), pages 19-22.
    15. Wen, Xin & Li, Yang & Zheng, Zemin, 2024. "Scalable efficient reproducible multi-task learning via data splitting," Statistics & Probability Letters, Elsevier, vol. 208(C).
    16. Emre Demirkaya & Yang Feng & Pallavi Basu & Jinchi Lv, 2022. "Large-scale model selection in misspecified generalized linear models [Information theory and an extension of the maximum likelihood principle]," Biometrika, Biometrika Trust, vol. 109(1), pages 123-136.
    17. Azadkia, Mona & Chatterjee, Sourav, 2021. "A simple measure of conditional dependence," LSE Research Online Documents on Economics 125584, London School of Economics and Political Science, LSE Library.
    18. Subhadeep Mukhopadhyay, 2021. "InfoGram and Admissible Machine Learning," Papers 2108.07380, arXiv.org, revised Aug 2021.
    19. D García Rasines & G A Young, 2023. "Splitting strategies for post-selection inference," Biometrika, Biometrika Trust, vol. 110(3), pages 597-614.
    20. Shi, Chengchun & Xu, Tianlin & Bergsma, Wicher & Li, Lexin, 2021. "Double generative adversarial networks for conditional independence testing," LSE Research Online Documents on Economics 112550, London School of Economics and Political Science, LSE Library.
