IDEAS home Printed from https://ideas.repec.org/a/bpj/sagmbi/v22y2023i1p14n1.html

Randomized singular value decomposition for integrative subtype analysis of ‘omics data’ using non-negative matrix factorization

Author

Listed:
  • Ni Yonghui
  • He Jianghua
  • Chalise Prabhakar

    (Department of Biostatistics and Data Science, University of Kansas Medical Center, 3901 Rainbow Blvd, Kansas City, KS 66160, USA)

Abstract

Integration of multiple ‘omics datasets for differentiating cancer subtypes is a powerful technique that leverages the consistent and complementary information across multi-omics data. Matrix factorization is a common technique used in integrative clustering for identifying latent subtype structure across multi-omics data. High dimensionality of the omics data and long computation time have been common challenges for clustering methods. To address these challenges, we propose randomized singular value decomposition (RSVD) for integrative clustering using non-negative matrix factorization: intNMF-rsvd. The method uses RSVD to reduce the dimensionality by projecting the data into an eigenvector space with a user-specified lower rank. Clustering analysis is then carried out by estimating a common basis matrix across the projected multi-omics datasets. The performance of the proposed method was assessed using simulated datasets and compared with six state-of-the-art integrative clustering methods using real-life datasets from The Cancer Genome Atlas Study. intNMF-rsvd was found to work efficiently and competitively compared with standard intNMF and other multi-omics clustering methods. Most importantly, intNMF-rsvd can handle a large number of features and significantly reduce the computation time. The identified subtypes can be used in further clinical association studies to understand the etiology of the disease.
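The two steps described in the abstract (RSVD projection to a lower rank, then estimation of a common basis matrix across the projected datasets) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `rsvd`, `project`, and `joint_nmf`, the oversampling and power-iteration settings, the shift that makes the projected data non-negative, the equal weighting of datasets, and the use of multiplicative updates are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def rsvd(X, rank, n_oversamples=10, n_iter=2, seed=0):
    """Randomized SVD via a Gaussian range finder with power iterations.
    Returns U, s, Vt truncated to `rank`. Parameter defaults are assumptions."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    k = min(rank + n_oversamples, n, p)          # sketch size
    Q, _ = np.linalg.qr(X @ rng.standard_normal((p, k)))
    for _ in range(n_iter):                      # sharpen the captured subspace
        Q, _ = np.linalg.qr(X.T @ Q)
        Q, _ = np.linalg.qr(X @ Q)
    Ub, s, Vt = np.linalg.svd(Q.T @ X, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]

def project(X, rank, seed=0):
    """Project samples (rows of X) onto the top right singular vectors.
    Assumption: shift by the minimum so the result is non-negative for NMF."""
    _, _, Vt = rsvd(X, rank, seed=seed)
    Z = X @ Vt.T                                 # n_samples x rank
    return Z - Z.min()

def joint_nmf(blocks, k, n_iter=200, seed=0, eps=1e-9):
    """Fit Z_i ~ W @ H_i with one shared basis W across all blocks.
    Uses equal weights and multiplicative updates (both assumptions)."""
    rng = np.random.default_rng(seed)
    n = blocks[0].shape[0]
    W = rng.random((n, k)) + eps
    Hs = [rng.random((k, Z.shape[1])) + eps for Z in blocks]
    for _ in range(n_iter):
        num = sum(Z @ H.T for Z, H in zip(blocks, Hs))
        den = W @ sum(H @ H.T for H in Hs) + eps
        W *= num / den
        Hs = [H * (W.T @ Z) / (W.T @ W @ H + eps) for Z, H in zip(blocks, Hs)]
    labels = W.argmax(axis=1)                    # cluster = dominant basis column
    return W, Hs, labels

# Usage on simulated data: two hypothetical omics blocks sharing 60 samples.
rng = np.random.default_rng(42)
X1 = rng.random((60, 500))
X2 = rng.random((60, 300))
Z1, Z2 = project(X1, rank=10), project(X2, rank=10)
W, Hs, labels = joint_nmf([Z1, Z2], k=3)
```

Each sample is assigned to the column of the shared matrix `W` with the largest loading. The speed-up comes from running the factorization on the 60 x 10 projected blocks rather than on the full feature matrices.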

Suggested Citation

  • Ni Yonghui & He Jianghua & Chalise Prabhakar, 2023. "Randomized singular value decomposition for integrative subtype analysis of ‘omics data’ using non-negative matrix factorization," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 22(1), pages 1-14, January.
  • Handle: RePEc:bpj:sagmbi:v:22:y:2023:i:1:p:14:n:1
    DOI: 10.1515/sagmb-2022-0047

    Download full text from publisher

    File URL: https://doi.org/10.1515/sagmb-2022-0047
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.



    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.