
ACMTF-R: Supervised multi-omics data integration uncovering shared and distinct outcome-associated variation

Author

Listed:
  • Geert Roelof van der Ploeg
  • Fred T G White
  • Rasmus Riemer Jakobsen
  • Johan A Westerhuis
  • Anna Heintz-Buschart
  • Age K Smilde

Abstract

The rapid growth of high-dimensional biological data has necessitated advanced data fusion techniques to integrate and interpret complex multi-omics and longitudinal datasets. Shared and unshared structure across such datasets can be identified in an unsupervised manner with Advanced Coupled Matrix and Tensor Factorization (ACMTF), but the structure it finds cannot be related to an outcome. Conversely, N-way Partial Least Squares (NPLS) is supervised and captures outcome-associated variation, but it cannot identify shared and unshared structure. To bridge the gap between data exploration and prediction, we introduce ACMTF-Regression (ACMTF-R), an extension of ACMTF that incorporates a regression step, allowing for the simultaneous decomposition of multi-way data while explicitly capturing variation associated with a dependent variable. We present a detailed mathematical formulation of ACMTF-R, including its optimisation algorithm and implementation. Through extensive simulations, we systematically evaluate its ability to recover a small y-related component shared between multiple blocks, its robustness to noise, and the impact of the tuning parameter (π), which controls the balance between data exploration and outcome prediction. Our results demonstrate that ACMTF-R robustly identifies the y-related component and correctly identifies outcome-associated shared and distinct variation, which distinguishes it from existing approaches such as NPLS and ACMTF. The development of ACMTF-R was motivated by a real-world dataset investigating how maternal pre-pregnancy BMI affects the human milk microbiome, human milk metabolome, and infant faecal microbiome. Emerging evidence suggests that inter-generational transfer of maternal obesity may affect multiple omics layers, highlighting the need to identify outcome-associated variation. The applicability of ACMTF-R is therefore validated by applying it to this multi-omics dataset. ACMTF-R successfully identifies novel mother-infant relationships associated with maternal pre-pregnancy BMI, underscoring its utility in multi-omics research. Our findings establish ACMTF-R as a versatile tool for multi-way data fusion, offering new insights into complex biological systems by integrating common, local, and distinct variation in the context of a dependent variable.
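
The abstract does not state the objective function, so the following is only a rough illustration of how a single weight such as π could trade off data reconstruction against outcome prediction. It is a minimal sketch for one two-way block, not the authors' formulation: the function name `weighted_loss`, the symbols `A`, `B`, `rho`, the normalisation of each term, and the convention that π weights the reconstruction term are all assumptions made for this example.

```python
import numpy as np

def weighted_loss(X, y, A, B, rho, pi):
    """Toy pi-weighted objective for one two-way block (illustrative only).

    X   : (n_samples, n_features) data block
    y   : (n_samples,) outcome
    A   : (n_samples, n_components) sample-mode scores
    B   : (n_features, n_components) feature loadings
    rho : (n_components,) regression coefficients on the scores
    pi  : weight in [0, 1]; ASSUMED here to weight the reconstruction term
    """
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    # Relative squared error of the low-rank reconstruction of X.
    recon = np.linalg.norm(X - A @ B.T) ** 2 / np.linalg.norm(X) ** 2
    # Relative squared error of predicting y from the same scores.
    pred = np.linalg.norm(y - A @ rho) ** 2 / np.linalg.norm(y) ** 2
    return pi * recon + (1.0 - pi) * pred

# Noise-free toy data: only the first component is related to y.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 2))
B = rng.normal(size=(15, 2))
rho = np.array([1.0, 0.0])
X = A @ B.T
y = A @ rho

loss_exact = weighted_loss(X, y, A, B, rho, pi=0.5)
loss_wrong_rho = weighted_loss(X, y, A, B, np.zeros(2), pi=0.5)
loss_wrong_rho_pi1 = weighted_loss(X, y, A, B, np.zeros(2), pi=1.0)
```

Under these assumptions, setting `pi=1.0` ignores the outcome entirely (a purely exploratory fit), while smaller values penalise scores that fail to predict `y`. In the toy data, a wrong `rho` is penalised at `pi=0.5` but goes unnoticed at `pi=1.0`.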

Suggested Citation

  • Geert Roelof van der Ploeg & Fred T G White & Rasmus Riemer Jakobsen & Johan A Westerhuis & Anna Heintz-Buschart & Age K Smilde, 2026. "ACMTF-R: Supervised multi-omics data integration uncovering shared and distinct outcome-associated variation," PLOS ONE, Public Library of Science, vol. 21(1), pages 1-27, January.
  • Handle: RePEc:plo:pone00:0339650
    DOI: 10.1371/journal.pone.0339650

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0339650
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0339650&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mehran Farzadmehr & Valentin Carlan & Thierry Vanelslander, 2023. "Contemporary challenges and AI solutions in port operations: applying Gale–Shapley algorithm to find best matches," Journal of Shipping and Trade, Springer, vol. 8(1), pages 1-44, December.
    2. Wei, Wei & Feng, Xiangnan, 2023. "Graphical representation and hierarchical decomposition mechanism for vertex-cover solution space," Applied Mathematics and Computation, Elsevier, vol. 458(C).
    3. Hend Bouziri & Khaled Mellouli & El-Ghazali Talbi, 2011. "The k-coloring fitness landscape," Journal of Combinatorial Optimization, Springer, vol. 21(3), pages 306-329, April.
    4. Xiaojuan Ning & Yule Liu & Yishu Ma & Zhiwei Lu & Haiyan Jin & Zhenghao Shi & Yinghui Wang, 2024. "TSPconv-Net: Transformer and Sparse Convolution for 3D Instance Segmentation in Point Clouds," Mathematics, MDPI, vol. 12(18), pages 1-15, September.
    5. Ekta Jain & Kalpana Dahiya & Vanita Verma, 2020. "A priority based unbalanced time minimization assignment problem," OPSEARCH, Springer;Operational Research Society of India, vol. 57(1), pages 13-45, March.
    6. Helena Gaspars-Wieloch, 2021. "The Assignment Problem in Human Resource Project Management under Uncertainty," Risks, MDPI, vol. 9(1), pages 1-17, January.
    7. Estate V. Khmaladze, 2021. "Distribution-free testing in linear and parametric regression," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(6), pages 1063-1087, December.
    8. Ivan Belik & Kurt Jornsten, 2018. "Critical objective function values in linear sum assignment problems," Journal of Combinatorial Optimization, Springer, vol. 35(3), pages 842-852, April.
    9. Amnon Rosenmann, 2022. "Computing the sequence of k-cardinality assignments," Journal of Combinatorial Optimization, Springer, vol. 44(2), pages 1265-1283, September.
    10. Li, Miao & Davari, Morteza & Goossens, Dries, 2023. "Multi-league sports scheduling with different leagues sizes," European Journal of Operational Research, Elsevier, vol. 307(1), pages 313-327.
