Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-33026-0.html

Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque

Author

Listed:
  • Adrià Fernández-Torras

    (The Barcelona Institute of Science and Technology)

  • Miquel Duran-Frigola

    (The Barcelona Institute of Science and Technology
    Ersilia Open Source Initiative)

  • Martino Bertoni

    (The Barcelona Institute of Science and Technology)

  • Martina Locatelli

    (The Barcelona Institute of Science and Technology)

  • Patrick Aloy

    (The Barcelona Institute of Science and Technology
    Institució Catalana de Recerca i Estudis Avançats (ICREA))

Abstract

Biomedical data is accumulating at a fast pace, and integrating it into a unified framework, so that multiple views of a given biological event can be considered simultaneously, is a major challenge. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data collected from over 150 data sources, including 12 types of biological entities (e.g., genes, diseases, drugs) linked by 67 types of associations (e.g., ‘drug treats disease’, ‘gene interacts with gene’). We show how Bioteque descriptors facilitate the assessment of high-throughput protein-protein interactome data and the prediction of drug response and new repurposing opportunities, and we demonstrate that they can be used off-the-shelf in downstream machine learning tasks without loss of performance with respect to using original data. The Bioteque thus offers a thoroughly processed, tractable, and highly optimized assembly of the biomedical knowledge available in the public domain.

Suggested Citation

  • Adrià Fernández-Torras & Miquel Duran-Frigola & Martino Bertoni & Martina Locatelli & Patrick Aloy, 2022. "Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-33026-0
    DOI: 10.1038/s41467-022-33026-0

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-33026-0
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Caitlin E. Mills & Kartik Subramanian & Marc Hafner & Mario Niepel & Luca Gerosa & Mirra Chung & Chiara Victor & Benjamin Gaudio & Clarence Yapp & Ajit J. Nirmal & Nicholas Clark & Peter K. Sorger, 2022. "Multiplexed and reproducible high content screening of live and fixed cells using Dye Drop," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    2. Yanli Liu & Zhong Wu & Jin Zhou & Dinesh K. A. Ramadurai & Katelyn L. Mortenson & Estrella Aguilera-Jimenez & Yifei Yan & Xiaojun Yang & Alison M. Taylor & Katherine E. Varley & Jason Gertz & Peter S., 2021. "A predominant enhancer co-amplified with the SOX2 oncogene is necessary and sufficient for its expression in squamous cancer," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    3. Xiao-Song Wang & Sanghoon Lee & Han Zhang & Gong Tang & Yue Wang, 2022. "An integral genomic signature approach for tailored cancer therapy using genome-wide sequencing data," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    4. Jurica Levatić & Marina Salvadores & Francisco Fuster-Tormo & Fran Supek, 2022. "Mutational signatures are markers of drug sensitivity of cancer cells," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    5. Kelsy C. Cotto & Yang-Yang Feng & Avinash Ramu & Megan Richters & Sharon L. Freshour & Zachary L. Skidmore & Huiming Xia & Joshua F. McMichael & Jason Kunisaki & Katie M. Campbell & Timothy Hung-Po Ch, 2023. "Integrated analysis of genomic and transcriptomic data for the discovery of splice-associated variants in cancer," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    6. Han Jin & Cheng Zhang & Martin Zwahlen & Kalle Feilitzen & Max Karlsson & Mengnan Shi & Meng Yuan & Xiya Song & Xiangyu Li & Hong Yang & Hasan Turkez & Linn Fagerberg & Mathias Uhlén & Adil Mardinoglu, 2023. "Systematic transcriptional analysis of human cell lines for gene expression landscape and tumor representation," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    7. Seungyeul Yoo & Abhilasha Sinha & Dawei Yang & Nasser K. Altorki & Radhika Tandon & Wenhui Wang & Deebly Chavez & Eunjee Lee & Ayushi S. Patel & Takashi Sato & Ranran Kong & Bisen Ding & Eric E. Schad, 2022. "Integrative network analysis of early-stage lung adenocarcinoma identifies aurora kinase inhibition as interceptor of invasion and progression," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    8. Alon Stern & Mariam Fokra & Boris Sarvin & Ahmad Abed Alrahem & Won Dong Lee & Elina Aizenshtein & Nikita Sarvin & Tomer Shlomi, 2023. "Inferring mitochondrial and cytosolic metabolism by coupling isotope tracing and deconvolution," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    9. Sayantani Ghosh Dastidar & Bony Kumar & Bo Lauckner & Damien Parrello & Danielle Perley & Maria Vlasenok & Antariksh Tyagi & Nii Koney-Kwaku Koney & Ata Abbas & Sergei Nechaev, 2023. "Transcriptional responses of cancer cells to heat shock-inducing stimuli involve amplification of robust HSF1 binding," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    10. Sumana Srivatsa & Hesam Montazeri & Gaia Bianco & Mairene Coto-Llerena & Mattia Marinucci & Charlotte K. Y. Ng & Salvatore Piscuoglio & Niko Beerenwinkel, 2022. "Discovery of synthetic lethal interactions from large-scale pan-cancer perturbation screens," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    11. Cemal Erdem & Sean M. Gross & Laura M. Heiser & Marc R. Birtwistle, 2023. "MOBILE pipeline enables identification of context-specific networks and regulatory mechanisms," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    12. Guidantonio Malagoli Tagliazucchi & Anna J. Wiecek & Eloise Withnell & Maria Secrier, 2023. "Genomic and microenvironmental heterogeneity shaping epithelial-to-mesenchymal trajectories in cancer," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    13. Philip East & Gavin P. Kelly & Dhruva Biswas & Michela Marani & David C. Hancock & Todd Creasy & Kris Sachsenmeier & Charles Swanton & Julian Downward & Sophie de Carné Trécesson, 2022. "RAS oncogenic activity predicts response to chemotherapy and outcome in lung adenocarcinoma," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    14. Caterina Bartolacci & Cristina Andreani & Gonçalo Vale & Stefano Berto & Margherita Melegari & Anna Colleen Crouch & Dodge L. Baluya & George Kemble & Kurt Hodges & Jacqueline Starrett & Katerina Poli, 2022. "Targeting de novo lipogenesis and the Lands cycle induces ferroptosis in KRAS-mutant lung cancer," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    15. Sanju Sinha & Karina Barbosa & Kuoyuan Cheng & Mark D. M. Leiserson & Prashant Jain & Anagha Deshpande & David M. Wilson & Bríd M. Ryan & Ji Luo & Ze’ev A. Ronai & Joo Sang Lee & Aniruddha J. Deshpand, 2021. "A systematic genome-wide mapping of oncogenic mutation selection during CRISPR-Cas9 genome editing," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    16. Zheqi Li & Olivia McGinn & Yang Wu & Amir Bahreini & Nolan M. Priedigkeit & Kai Ding & Sayali Onkar & Caleb Lampenfeld & Carol A. Sartorius & Lori Miller & Margaret Rosenzweig & Ofir Cohen & Nikhil Wa, 2022. "ESR1 mutant breast cancers show elevated basal cytokeratins and immune activation," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    17. Bingzhen Chen & Wenjuan Zhai & Lingchen Kong, 2022. "Variable selection and collinearity processing for multivariate data via row-elastic-net regularization," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 106(1), pages 79-96, March.
    18. Hao Wang & R. Alejandro Sica & Gurbakhash Kaur & Phillip M. Galbo & Zhixin Jing & Christopher D. Nishimura & Xiaoxin Ren & Ankit Tanwar & Bijan Etemad-Gilbertson & Britta Will & Deyou Zheng & David Fo, 2024. "TMIGD2 is an orchestrator and therapeutic target on human acute myeloid leukemia stem cells," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    19. Ozge Saatci & Metin Cetin & Meral Uner & Unal Metin Tokat & Ioulia Chatzistamou & Pelin Gulizar Ersan & Elodie Montaudon & Aytekin Akyol & Sercan Aksoy & Aysegul Uner & Elisabetta Marangoni & Mathew S, 2023. "Toxic PARP trapping upon cAMP-induced DNA damage reinstates the efficacy of endocrine therapy and CDK4/6 inhibitors in treatment-refractory ER+ breast cancer," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    20. Tanaz Sharifnia & Mathias J. Wawer & Amy Goodale & Yenarae Lee & Mariya Kazachkova & Joshua M. Dempster & Sandrine Muller & Joan Levy & Daniel M. Freed & Josh Sommer & Jérémie Kalfon & Francisca Vazqu, 2023. "Mapping the landscape of genetic dependencies in chordoma," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
