Printed from https://ideas.repec.org/a/spr/testjl/v33y2024i4d10.1007_s11749-024-00934-w.html

Conformal link prediction for false discovery rate control

Author

Listed:
  • Ariane Marandon

    (Sorbonne Université)

Abstract

Most link prediction methods return estimates of the connection probability of missing edges in a graph. Such output can be used to rank the missing edges from most to least likely to be true edges, but it does not directly classify them as true or nonexistent. In this work, we consider the problem of identifying a set of true edges while controlling the false discovery rate (FDR). We propose a novel method based on high-level ideas from the literature on conformal inference. The graph structure induces intricate dependence in the data, which sets this setup apart from the usual conformal inference setting, where data exchangeability is assumed; we carefully take this dependence into account. FDR control is demonstrated empirically on both simulated and real data.
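To make the general idea concrete, here is a minimal sketch of the standard exchangeable recipe that conformal FDR control builds on: conformal p-values computed from a reference set of scores, followed by the Benjamini-Hochberg step-up procedure. It is not the method proposed in the article, which must additionally handle the dependence induced by the graph. The function names, the use of scores from known non-edges as the reference set, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np


def conformal_pvalues(calib_null_scores, test_scores):
    """Conformal p-values for candidate edges (illustrative helper, hypothetical name).

    calib_null_scores: link-prediction scores of pairs assumed to be non-edges.
    test_scores: scores of the candidate (missing) edges.
    A higher score means "more edge-like", so a small p-value is evidence
    that the candidate is a true edge.
    """
    calib = np.sort(np.asarray(calib_null_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = calib.size
    # Count the calibration scores that are >= each test score.
    n_ge = n - np.searchsorted(calib, test, side="left")
    return (1.0 + n_ge) / (n + 1.0)


def benjamini_hochberg(pvalues, alpha):
    """Boolean mask of the hypotheses rejected by the BH step-up procedure."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        # Reject every hypothesis up to the largest sorted index under its threshold.
        k = np.nonzero(below)[0].max()
        rejected[order[: k + 1]] = True
    return rejected


# Toy usage with made-up scores (assumption: values chosen only for illustration).
calib_scores = np.linspace(0.0, 0.5, 99)          # scores of assumed non-edges
candidate_scores = np.array([0.9, 0.8, 0.3, 0.1])  # scores of missing pairs
pvals = conformal_pvalues(calib_scores, candidate_scores)
selected = benjamini_hochberg(pvals, alpha=0.1)    # candidates declared true edges
```

Under exchangeability of the reference and null test scores, this combination is known to control the FDR at level alpha; the abstract's point is that this assumption does not hold as-is for graph data, so the sketch should be read only as the baseline the article departs from.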

Suggested Citation

  • Ariane Marandon, 2024. "Conformal link prediction for false discovery rate control," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 33(4), pages 1062-1083, December.
  • Handle: RePEc:spr:testjl:v:33:y:2024:i:4:d:10.1007_s11749-024-00934-w
    DOI: 10.1007/s11749-024-00934-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11749-024-00934-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11749-024-00934-w?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wang, Dan & Zhou, Xiao & Zhao, Pengwei & Pang, Juan & Ren, Qiaoyang, 2025. "Early identification of breakthrough technologies: Insights from science-driven innovations," Journal of Informetrics, Elsevier, vol. 19(1).
    2. Ghosh Debashis, 2012. "Incorporating the Empirical Null Hypothesis into the Benjamini-Hochberg Procedure," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(4), pages 1-21, July.
    3. Liu, Chuang & Zhou, Wei-Xing, 2012. "Heterogeneity in initial resource configurations improves a network-based hybrid recommendation algorithm," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 391(22), pages 5704-5711.
    4. Xia, Yongxiang & Pang, Wenbo & Zhang, Xuejun, 2021. "Mining relationships between performance of link prediction algorithms and network structure," Chaos, Solitons & Fractals, Elsevier, vol. 153(P2).
    5. Aslan, Serpil & Kaya, Buket & Kaya, Mehmet, 2019. "Predicting potential links by using strengthened projections in evolving bipartite networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 525(C), pages 998-1011.
    6. Leto Peel & Tiago P. Peixoto & Manlio De Domenico, 2022. "Statistical inference links data and theory in network science," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    7. Xiaoquan Wen, 2017. "Robust Bayesian FDR Control Using Bayes Factors, with Applications to Multi-tissue eQTL Discovery," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(1), pages 28-49, June.
    8. Joshua Habiger & David Watts & Michael Anderson, 2017. "Multiple testing with heterogeneous multinomial distributions," Biometrics, The International Biometric Society, vol. 73(2), pages 562-570, June.
    9. Wang, Zuxi & Wu, Yao & Li, Qingguang & Jin, Fengdong & Xiong, Wei, 2016. "Link prediction based on hyperbolic mapping with community structure for complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 450(C), pages 609-623.
    10. Lee, Yan-Li & Zhou, Tao, 2021. "Collaborative filtering approach to link prediction," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 578(C).
    11. Alejandro Ochoa & John D Storey & Manuel Llinás & Mona Singh, 2015. "Beyond the E-Value: Stratified Statistics for Protein Domain Prediction," PLOS Computational Biology, Public Library of Science, vol. 11(11), pages 1-21, November.
    12. Peng Liu & Liang Gui & Huirong Wang & Muhammad Riaz, 2022. "A Two-Stage Deep-Learning Model for Link Prediction Based on Network Structure and Node Attributes," Sustainability, MDPI, vol. 14(23), pages 1-15, December.
    13. Yao, Can-Zhong & Lin, Ji-Nan & Zheng, Xu-Zhou & Liu, Xiao-Feng, 2015. "The study of RMB exchange rate complex networks based on fluctuation mode," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 436(C), pages 359-376.
    14. Liu, Zhenfeng & Feng, Jian & Uden, Lorna, 2023. "Technology opportunity analysis using hierarchical semantic networks and dual link prediction," Technovation, Elsevier, vol. 128(C).
    15. Zhang, Xue & Wang, Xiaojie & Zhao, Chengli & Yi, Dongyun & Xie, Zheng, 2014. "Degree-corrected stochastic block models and reliability in networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 393(C), pages 553-559.
    16. Chi, Kuo & Qu, Hui & Yin, Guisheng, 2022. "Link prediction for existing links in dynamic networks based on the attraction force," Chaos, Solitons & Fractals, Elsevier, vol. 159(C).
    17. Zhou, Tao & Lee, Yan-Li & Wang, Guannan, 2021. "Experimental analyses on 2-hop-based and 3-hop-based link prediction algorithms," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 564(C).
    18. Dennis Leung & Wenguang Sun, 2022. "ZAP: Z-value adaptive procedures for false discovery rate control with side information," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1886-1946, November.
    19. Shang, Ke-ke & Small, Michael & Yan, Wei-sheng, 2017. "Fitness networks for real world systems via modified preferential attachment," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 474(C), pages 49-60.
    20. T. Tony Cai & Wenguang Sun, 2017. "Optimal screening and discovery of sparse signals with applications to multistage high throughput studies," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(1), pages 197-223, January.

