
Dimensionality reduction of longitudinal ’omics data using modern tensor factorizations

Author

Listed:
  • Uria Mor
  • Yotam Cohen
  • Rafael Valdés-Mas
  • Denise Kviatcovsky
  • Eran Elinav
  • Haim Avron

Abstract

Longitudinal ’omics analytical methods are extensively used in the evolving field of precision medicine. They enable ‘big data’ recording and high-resolution interpretation of complex datasets, driven by individual variations in response to perturbations such as disease pathogenesis, medical treatment or changes in lifestyle. However, inherent technical limitations in biomedical studies often result in feature-rich and sample-limited datasets. Analyzing such data using conventional modalities often proves challenging, since the repeated, high-dimensional measurements are dominated by inconsequential variations that must be filtered out in order to find the true, biologically relevant signal. Tensor methods for the analysis and meaningful representation of multiway data may prove useful to the biological research community by their advertised ability to tackle this challenge. In this study, we present tcam, a new unsupervised tensor factorization method for the analysis of multiway data. Building on cutting-edge developments in the field of tensor-tensor algebra, we characterize the unique mathematical properties of our method, namely: 1) preservation of geometric and statistical traits of the data, which enables uncovering information beyond the inter-individual variation that often takes over the focus, especially in human studies; and 2) a natural and straightforward out-of-sample extension, making tcam amenable to integration in machine learning workflows. A series of re-analyses of real-world, human experimental datasets showcases these theoretical properties, while providing empirical confirmation of tcam’s utility in the analysis of longitudinal ’omics data.

Author summary

Tensor methods have proven useful for the exploration of high-dimensional, multiway data produced in longitudinal ’omics studies. However, even the most recent applications of these methods to ’omics data are based on the canonical polyadic tensor-rank factorization, whose results depend heavily on the choice of target rank, lack any guarantee of optimal approximation, and do not allow for out-of-sample extension in a straightforward manner. In this paper, we present a tensor component analysis method for longitudinal ’omics data, built on cutting-edge developments in the field of tensor-tensor algebra. We show that our method, in contrast to existing tensor methods, enjoys provably optimal properties regarding distortion and variance in the embedding space. These properties enable direct and meaningful interpretation and allow traditional multivariate statistical analysis to be performed in the embedding space. Because the method is constructed using tensor-tensor products, mapping a point to the embedding space of a pre-trained factorization is simple and scalable, which allows our method to serve as a feature engineering step in standard machine learning workflows.
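
The two properties named above (variance-ordered components and a simple out-of-sample mapping) can be illustrated with a minimal NumPy sketch of a transform-domain factorization. This is not the authors' implementation: the choice of an orthonormal DCT along the time mode, the (subjects, features, timepoints) layout, and the helper names `dct_matrix`, `fit` and `transform` are all assumptions made for illustration.

```python
import numpy as np


def dct_matrix(t):
    """Orthonormal DCT-II matrix (t x t); an assumed choice of transform."""
    k = np.arange(t)[:, None]
    j = np.arange(t)[None, :]
    M = np.sqrt(2.0 / t) * np.cos(np.pi * (2 * j + 1) * k / (2 * t))
    M[0, :] /= np.sqrt(2.0)
    return M


def fit(X, q):
    """Fit on X of shape (subjects, features, timepoints); keep q components."""
    n, p, t = X.shape
    M = dct_matrix(t)
    mean = X.mean(axis=0, keepdims=True)        # mean over subjects
    Xh = np.einsum("ijk,lk->ijl", X - mean, M)  # apply M along the time mode
    comps = []                                  # (singular value, slice, loading)
    for k in range(t):
        # Ordinary SVD of each frontal slice in the transform domain.
        _, s, Vt = np.linalg.svd(Xh[:, :, k], full_matrices=False)
        comps.extend((s[r], k, Vt[r]) for r in range(len(s)))
    # Rank components across all slices by singular value and keep the top q.
    comps.sort(key=lambda c: -c[0])
    top = comps[:q]
    return {
        "M": M,
        "mean": mean,
        "singular_values": np.array([c[0] for c in top]),
        "slices": [c[1] for c in top],
        "loadings": np.stack([c[2] for c in top]),  # shape (q, features)
    }


def transform(model, X_new):
    """Map new subjects into the embedding space of a fitted model."""
    Xh = np.einsum("ijk,lk->ijl", X_new - model["mean"], model["M"])
    cols = [Xh[:, :, k] @ v for k, v in zip(model["slices"], model["loadings"])]
    return np.stack(cols, axis=1)               # shape (subjects, q)


# Hypothetical data: 6 subjects, 5 features, 4 timepoints.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5, 4))
model = fit(X, q=3)
Z = transform(model, X)
```

Because the assumed transform is orthonormal, the squared singular values over all slices sum to the squared Frobenius norm of the centered data, so ranking components by singular value ranks them by explained variance. Mapping a held-out subject only requires centering with the stored mean, applying the same transform, and multiplying by the stored loadings.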

Suggested Citation

  • Uria Mor & Yotam Cohen & Rafael Valdés-Mas & Denise Kviatcovsky & Eran Elinav & Haim Avron, 2022. "Dimensionality reduction of longitudinal ’omics data using modern tensor factorizations," PLOS Computational Biology, Public Library of Science, vol. 18(7), pages 1-18, July.
  • Handle: RePEc:plo:pcbi00:1010212
    DOI: 10.1371/journal.pcbi.1010212

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010212
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1010212&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Anna Halama & Shaza Zaghlool & Gaurav Thareja & Sara Kader & Wadha Al Muftah & Marjonneke Mook-Kanamori & Hina Sarwath & Yasmin Ali Mohamoud & Nisha Stephan & Sabine Ameling & Maja Pucic Baković & Jan, 2024. "A roadmap to the molecular human linking multiomics with population traits and diabetes subtypes," Nature Communications, Nature, vol. 15(1), pages 1-23, December.
