Redundancy-aware unsupervised ranking based on game theory: Ranking pathways in collections of gene sets

Author

Listed:
  • Chiara Balestra
  • Carlo Maj
  • Emmanuel Müller
  • Andreas Mayr

Abstract

In genetics, gene sets are grouped into collections according to their biological function. This often leads to high-dimensional, overlapping, and redundant families of sets, which precludes a straightforward interpretation of their biological meaning. In data mining, it is often argued that techniques for reducing the dimensionality of data can make large data sets easier to handle and, consequently, easier to interpret. Moreover, in recent years the machine learning and bioinformatics communities have become increasingly aware of the importance of understanding data and of interpretable models. On the one hand, there exist techniques that aggregate overlapping gene sets to create larger pathways. While these methods could partly solve the problem of the collections’ large size, modifying biological pathways is hardly justifiable in this biological context. On the other hand, the representation methods proposed so far to increase the interpretability of collections of gene sets have proved insufficient. Inspired by this bioinformatics context, we propose a method to rank sets within a family of sets based on the distribution of the singletons and their size. We obtain the sets’ importance scores by computing Shapley values; by making use of microarray games, we do not incur the typical exponential computational complexity. Moreover, we address the challenge of constructing redundancy-aware rankings where, in our case, redundancy is a quantity proportional to the size of the intersections among the sets in the collections. We use the obtained rankings to reduce the dimension of the families, thereby obtaining lower redundancy among sets while still preserving a high coverage of their elements.
Finally, we evaluate our approach on collections of gene sets and apply Gene Set Enrichment Analysis techniques to the now smaller collections. As expected, given the unsupervised nature of the proposed rankings, the number of significant gene sets for specific phenotypic traits changes only marginally, whereas the number of statistical tests performed can be drastically reduced. The proposed rankings are of practical use in bioinformatics for increasing the interpretability of collections of gene sets and represent a step toward including redundancy-awareness in Shapley value computations.
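
The abstract gives no formulas, so the following is only a minimal sketch of the idea it describes. It assumes the standard microarray-game construction, in which every element of the universe defines a unanimity game on the sets that contain it and the game is the average of these. Under that assumption the Shapley value of a set has a closed form (each element contributes 1 divided by the number of sets containing it, normalized by the number of elements), which is why no exponential enumeration of coalitions is needed. The Jaccard-based penalty in the greedy step is a hypothetical stand-in for "a quantity proportional to the size of intersections" and is not necessarily the authors' redundancy measure. All function names and the toy family are assumptions made for illustration.

```python
from fractions import Fraction


def microarray_shapley(family):
    """Closed-form Shapley values, assuming a microarray-style game.

    `family` maps a set name to a non-empty set of elements. Each element
    splits a unit of worth equally among the sets that contain it.
    """
    universe = set().union(*family.values())
    scores = {name: Fraction(0) for name in family}
    for element in universe:
        holders = [name for name, members in family.items() if element in members]
        for name in holders:
            scores[name] += Fraction(1, len(holders) * len(universe))
    return scores


def jaccard(a, b):
    """Jaccard index of two non-empty sets."""
    return Fraction(len(a & b), len(a | b))


def greedy_redundancy_aware_ranking(family):
    """Rank sets greedily; the penalty below is a hypothetical choice.

    At each step, a candidate's Shapley score is scaled down by its largest
    Jaccard overlap with any set that has already been ranked.
    """
    scores = microarray_shapley(family)
    ranking = []
    remaining = set(family)

    def penalized(name):
        overlap = max(
            (jaccard(family[name], family[r]) for r in ranking),
            default=Fraction(0),
        )
        return scores[name] * (1 - overlap)

    while remaining:
        # Sorting the candidates makes tie-breaking deterministic.
        best = max(sorted(remaining), key=penalized)
        ranking.append(best)
        remaining.remove(best)
    return ranking


def coverage(family, selected):
    """Fraction of the universe covered by the selected sets."""
    universe = set().union(*family.values())
    covered = set().union(*(family[name] for name in selected))
    return Fraction(len(covered), len(universe))


# Toy family of "gene sets" (hypothetical data).
family = {"A": {1, 2, 3}, "B": {2, 3}, "C": {4}}
ranking = greedy_redundancy_aware_ranking(family)
print(microarray_shapley(family))
print(ranking, coverage(family, ranking[:2]))
```

In this toy family, B and C tie on the raw score (1/4 each), but B lies entirely inside A. The penalty therefore ranks C ahead of B, and the two top-ranked sets, A and C, cover every element, whereas A and B together would cover only three of the four.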

Suggested Citation

  • Chiara Balestra & Carlo Maj & Emmanuel Müller & Andreas Mayr, 2023. "Redundancy-aware unsupervised ranking based on game theory: Ranking pathways in collections of gene sets," PLOS ONE, Public Library of Science, vol. 18(3), pages 1-17, March.
  • Handle: RePEc:plo:pone00:0282699
    DOI: 10.1371/journal.pone.0282699

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0282699
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0282699&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Stefano Moretti & Fioravante Patrone & Stefano Bonassi, 2007. "The class of microarray games and the relevance index for genes," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 15(2), pages 256-280, December.
    2. Celia Fontanillo & Ruben Nogales-Cadenas & Alberto Pascual-Montano & Javier De Las Rivas, 2011. "Functional Analysis beyond Enrichment: Non-Redundant Reciprocal Linkage of Genes and Biological Terms," PLOS ONE, Public Library of Science, vol. 6(9), pages 1-10, September.
    3. Shinichi Nakagawa, 2004. "A farewell to Bonferroni: the problems of low statistical power and publication bias," Behavioral Ecology, International Society for Behavioral Ecology, vol. 15(6), pages 1044-1045, November.
    4. Antigoni Elefsinioti & Marit Ackermann & Andreas Beyer, 2009. "Accounting for Redundancy when Integrating Gene Interaction Databases," PLOS ONE, Public Library of Science, vol. 4(10), pages 1-9, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ricardo Martínez & Joaquín Sánchez-Soriano, 2021. "Social solidarity with dummies in the museum pass problem," ThE Papers 21/11, Department of Economic Theory and Economic History of the University of Granada.
    2. R. C. Rodríguez-Caro & E. Graciá & S. P. Blomberg & H. Cayuela & M. Grace & C. P. Carmona & H. A. Pérez-Mendoza & A. Giménez & R. Salguero-Gómez, 2023. "Anthropogenic impacts on threatened species erode functional diversity in chelonians and crocodilians," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    3. Felix Drinkall & Stefan Zohren & Michael McMahon & Janet B. Pierrehumbert, 2025. "Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups," Papers 2502.14497, arXiv.org.
    4. Sofie Boterberg & Elise Vantroys & Boel De Paepe & Rudy Van Coster & Herbert Roeyers, 2022. "Urine lactate concentration as a non-invasive screener for metabolic abnormalities: Findings in children with autism spectrum disorder and regression," PLOS ONE, Public Library of Science, vol. 17(9), pages 1-23, September.
    5. Stefano Moretti & Fioravante Patrone, 2008. "Transversality of the Shapley value," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 16(1), pages 1-41, July.
    6. Chuanhua Xing & David B Dunson, 2011. "Bayesian Inference for Genomic Data Integration Reduces Misclassification Rate in Predicting Protein-Protein Interactions," PLOS Computational Biology, Public Library of Science, vol. 7(7), pages 1-10, July.
    7. Tilov, Ivan & Weber, Sylvain, 2023. "Heterogeneity in price elasticity of vehicle kilometers traveled: Evidence from micro-level panel data," Energy Economics, Elsevier, vol. 127(PA).
    8. Steven Bednar & Kathryn Rouse, 2020. "The effect of physical education on children's body weight and human capital: New evidence from the ECLS‐K:2011," Health Economics, John Wiley & Sons, Ltd., vol. 29(4), pages 393-405, April.
    9. Ray, Tridip & Roy Chaudhuri, Arka & Sahai, Komal, 2020. "Whose education matters? An analysis of inter caste marriages in India," Journal of Economic Behavior & Organization, Elsevier, vol. 176(C), pages 619-633.
    10. Ryan J. McGill, 2017. "Re(Examining) Relations between CHC Broad and Narrow Cognitive Abilities and Reading Achievement," Journal of Educational and Developmental Psychology, Canadian Center of Science and Education, vol. 7(1), pages 265-265, May.
    11. Burke, Mary A. & Fournier, Gary M. & Prasad, Kislaya, 2010. "Geographic variations in a model of physician treatment choice with social interactions," Journal of Economic Behavior & Organization, Elsevier, vol. 73(3), pages 418-432, March.
    12. Hamers, Herbert & Husslage, Bart & Lindelauf, R. & Campen, Tjeerd, 2016. "A New Approximation Method for the Shapley Value Applied to the WTC 9/11 Terrorist Attack," Other publications TiSEM 8a67b416-1091-4efe-a1a6-7, Tilburg University, School of Economics and Management.
    13. Giulia Bernardi & Roberto Lucchetti, 2015. "Generating Semivalues via Unanimity Games," Journal of Optimization Theory and Applications, Springer, vol. 166(3), pages 1051-1062, September.
    14. Shiwang Yu & Na Guo & Caimiao Zheng & Yu Song & Jianli Hao, 2021. "Investigating the Association between Outdoor Environment and Outdoor Activities for Seniors Living in Old Residential Communities," IJERPH, MDPI, vol. 18(14), pages 1-16, July.
    15. repec:plo:pone00:0061048 is not listed on IDEAS
    16. Hannah Fraser & Tim Parker & Shinichi Nakagawa & Ashley Barnett & Fiona Fidler, 2018. "Questionable research practices in ecology and evolution," PLOS ONE, Public Library of Science, vol. 13(7), pages 1-16, July.
    17. Christophe Schalck & Meryem Yankol-Schalck, 2021. "Predicting French SME failures: new evidence from machine learning techniques," Applied Economics, Taylor & Francis Journals, vol. 53(51), pages 5948-5963, November.
    18. Dora Gyori & Bernadett Frida Farkas & Lili Olga Horvath & Daniel Komaromy & Gergely Meszaros & Dora Szentivanyi & Judit Balazs, 2021. "The Association of Nonsuicidal Self-Injury with Quality of Life and Mental Disorders in Clinical Adolescents—A Network Approach," IJERPH, MDPI, vol. 18(4), pages 1-21, February.
    19. Djankov, Simeon & Nikolova, Elena, 2018. "Communism as the unhappy coming," Journal of Comparative Economics, Elsevier, vol. 46(3), pages 708-721.
    20. Tom C. van der Zanden & Hans L. Bodlaender & Herbert J. M. Hamers, 2023. "Efficiently computing the Shapley value of connectivity games in low-treewidth graphs," Operational Research, Springer, vol. 23(1), pages 1-23, March.
    21. Ricardo Martínez & Joaquín Sánchez-Soriano, 2021. "Mathematical indices for the influence of risk factors on the lethality of a disease," ThE Papers 21/02, Department of Economic Theory and Economic History of the University of Granada.
