
A Unified Bayesian Framework for Bi-overlapping-Clustering Multi-omics Data via Sparse Matrix Factorization

Author

Listed:
  • Fangting Zhou

    (Renmin University of China
    Texas A&M University)

  • Kejun He

    (Renmin University of China)

  • James J. Cai

    (Texas A&M University)

  • Laurie A. Davidson

    (Texas A&M University
    Texas A&M University)

  • Robert S. Chapkin

    (Texas A&M University
    Texas A&M University)

  • Yang Ni

    (Texas A&M University)

Abstract

Advances in modern sequencing techniques have generated an unprecedented amount of multi-omics data, which provide great opportunities to quantitatively explore functional genomes from different but complementary perspectives. However, distinct modalities and sequencing technologies generate diverse types of data, which greatly complicates statistical modeling because uniquely optimized methods are required to handle each type of data. In this paper, we propose a unified framework for Bayesian nonparametric matrix factorization that infers overlapping bi-clusters for multi-omics data. The proposed method adaptively discretizes different types of observations into common latent states on which cluster structures are built hierarchically, and, being Bayesian nonparametric, it automatically determines the number of clusters. We demonstrate the utility of the proposed method using simulation studies and applications to a single-cell RNA-sequencing dataset, a combined single-cell RNA-sequencing and single-cell ATAC-sequencing dataset, a bulk RNA-sequencing dataset, and a DNA methylation dataset, which reveal several interesting findings that are consistent with the biological literature.
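
As a rough illustration of the two ideas named in the abstract (discretizing observations into common latent states, and overlapping bi-clusters expressed as a sparse matrix factorization), the following minimal sketch may help. It is not the authors' model: the fixed thresholds stand in for the adaptive, data-type-specific discretization, the binary membership matrices are given by hand rather than inferred under a Bayesian nonparametric prior, and the function names `discretize` and `biclusters_to_pattern` are hypothetical.

```python
import numpy as np


def discretize(x, low, high):
    """Map raw values to latent states -1, 0, +1 using fixed thresholds.

    Assumption: fixed cut-offs are a stand-in for the adaptive
    discretization described in the abstract.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x < low, -1, np.where(x > high, 1, 0))


def biclusters_to_pattern(row_membership, col_membership):
    """Combine binary row/column memberships into a binary pattern matrix.

    Entry (i, j) is 1 when sample i and feature j share at least one
    bi-cluster, i.e. the Boolean product of the two membership matrices.
    """
    r = np.asarray(row_membership, dtype=int)
    c = np.asarray(col_membership, dtype=int)
    return (r @ c.T > 0).astype(int)


# Four samples and three features assigned to two bi-clusters.
# Sample 1 and feature 1 belong to both, so the bi-clusters overlap.
rows = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
cols = np.array([[1, 0], [1, 1], [0, 1]])

states = discretize([0.0, 5.0, 12.0], low=1.0, high=10.0)
pattern = biclusters_to_pattern(rows, cols)
print(states)
print(pattern)
```

Allowing a row or column to carry more than one nonzero membership is what makes the bi-clusters overlapping; a row of zeros (the last sample here) belongs to no bi-cluster at all.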

Suggested Citation

  • Fangting Zhou & Kejun He & James J. Cai & Laurie A. Davidson & Robert S. Chapkin & Yang Ni, 2023. "A Unified Bayesian Framework for Bi-overlapping-Clustering Multi-omics Data via Sparse Matrix Factorization," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(3), pages 669-691, December.
  • Handle: RePEc:spr:stabio:v:15:y:2023:i:3:d:10.1007_s12561-022-09350-w
    DOI: 10.1007/s12561-022-09350-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s12561-022-09350-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s12561-022-09350-w?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. T T Cai & H Li & J Ma & Y Xia, 2019. "Differential Markov random field analysis with an application to detecting differential microbial community networks," Biometrika, Biometrika Trust, vol. 106(2), pages 401-416.
    2. Stephen Johnson, 1967. "Hierarchical clustering schemes," Psychometrika, Springer;The Psychometric Society, vol. 32(3), pages 241-254, September.
    3. Daniel D. Lee & H. Sebastian Seung, 1999. "Learning the parts of objects by non-negative matrix factorization," Nature, Nature, vol. 401(6755), pages 788-791, October.
    4. Jacques Banchereau & Ralph M. Steinman, 1998. "Dendritic cells and the control of immunity," Nature, Nature, vol. 392(6673), pages 245-252, March.
    5. Jason D. Buenrostro & Beijing Wu & Ulrike M. Litzenburger & Dave Ruff & Michael L. Gonzales & Michael P. Snyder & Howard Y. Chang & William J. Greenleaf, 2015. "Single-cell chromatin accessibility reveals principles of regulatory variation," Nature, Nature, vol. 523(7561), pages 486-490, July.
    6. Veronika Ročková & Edward I. George, 2016. "Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1608-1622, October.
    7. Jong Kyoung Kim & Aleksandra A. Kolodziejczyk & Tomislav Ilicic & Sarah A. Teichmann & John C. Marioni, 2015. "Characterizing noise structure in single-cell RNA-seq distinguishes genuine from technical stochastic allelic expression," Nature Communications, Nature, vol. 6(1), pages 1-9, December.
    8. Giovanni Parmigiani & Elizabeth S. Garrett & Ramaswamy Anbazhagan & Edward Gabrielson, 2002. "A statistical framework for expression‐based molecular classification in cancer," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(4), pages 717-736, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wentao Qu & Xianchao Xiu & Huangyue Chen & Lingchen Kong, 2023. "A Survey on High-Dimensional Subspace Clustering," Mathematics, MDPI, vol. 11(2), pages 1-39, January.
    2. Rafael Teixeira & Mário Antunes & Diogo Gomes & Rui L. Aguiar, 2024. "Comparison of Semantic Similarity Models on Constrained Scenarios," Information Systems Frontiers, Springer, vol. 26(4), pages 1307-1330, August.
    3. Del Corso, Gianna M. & Romani, Francesco, 2019. "Adaptive nonnegative matrix factorization and measure comparisons for recommender systems," Applied Mathematics and Computation, Elsevier, vol. 354(C), pages 164-179.
    4. P Fogel & C Geissler & P Cotte & G Luta, 2022. "Applying separative non-negative matrix factorization to extra-financial data," Working Papers hal-03689774, HAL.
    5. Xiao-Bai Li & Jialun Qin, 2017. "Anonymizing and Sharing Medical Text Records," Information Systems Research, INFORMS, vol. 28(2), pages 332-352, June.
    6. Claudia Quinteros-Cartaya & Guillermo Solorio-Magaña & Francisco Javier Núñez-Cornú & Felipe de Jesús Escalona-Alcázar & Diana Núñez, 2023. "Microearthquakes in the Guadalajara Metropolitan Zone, Mexico: evidence from buried active faults in Tesistán Valley, Zapopan," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 116(3), pages 2797-2818, April.
    7. repec:ers:journl:v:xxiv:y:2021:i:4b:p:659-667 is not listed on IDEAS
    8. Kim, Junyung & Shah, Asad Ullah Amin & Kang, Hyun Gook, 2020. "Dynamic risk assessment with bayesian network and clustering analysis," Reliability Engineering and System Safety, Elsevier, vol. 201(C).
    9. Thanh Loc Nguyen & Youngjin Choi & Jihye Im & Hyunsu Shin & Ngoc Man Phan & Min Kyung Kim & Seung Woo Choi & Jaeyun Kim, 2022. "Immunosuppressive biomaterial-based therapeutic vaccine to treat multiple sclerosis via re-establishing immune tolerance," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    10. János Abonyi & Ádám Ipkovich & Gyula Dörgő & Károly Héberger, 2023. "Matrix factorization-based multi-objective ranking–What makes a good university?," PLOS ONE, Public Library of Science, vol. 18(4), pages 1-30, April.
    11. Naiyang Guan & Lei Wei & Zhigang Luo & Dacheng Tao, 2013. "Limited-Memory Fast Gradient Descent Method for Graph Regularized Nonnegative Matrix Factorization," PLOS ONE, Public Library of Science, vol. 8(10), pages 1-10, October.
    12. Roberts, Leigh, 2014. "Consistent estimation of breakpoints in time series, with application to wavelet analysis of Citigroup returns," Working Paper Series 18815, Victoria University of Wellington, School of Economics and Finance.
    13. Spelta, A. & Pecora, N. & Rovira Kaltwasser, P., 2019. "Identifying Systemically Important Banks: A temporal approach for macroprudential policies," Journal of Policy Modeling, Elsevier, vol. 41(1), pages 197-218.
    14. M. Moghadam & K. Aminian & M. Asghari & M. Parnianpour, 2013. "How well do the muscular synergies extracted via non-negative matrix factorisation explain the variation of torque at shoulder joint?," Computer Methods in Biomechanics and Biomedical Engineering, Taylor & Francis Journals, vol. 16(3), pages 291-301.
    15. Markovsky, Ivan & Niranjan, Mahesan, 2010. "Approximate low-rank factorization with structured factors," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3411-3420, December.
    16. Wei Jin & Jingchun Ma & Li Rong & Shengshuo Huang & Tuo Li & Guoxiang Jin & Zhongjun Zhou, 2025. "Semi-automated IT-scATAC-seq profiles cell-specific chromatin accessibility in differentiation and peripheral blood populations," Nature Communications, Nature, vol. 16(1), pages 1-13, December.
    17. David G Mets & Michael S Brainard, 2018. "An automated approach to the quantitation of vocalizations and vocal learning in the songbird," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-29, August.
    18. Paul Fogel & Yann Gaston-Mathé & Douglas Hawkins & Fajwel Fogel & George Luta & S. Stanley Young, 2016. "Applications of a Novel Clustering Approach Using Non-Negative Matrix Factorization to Environmental Research in Public Health," IJERPH, MDPI, vol. 13(5), pages 1-14, May.
    19. Le Thi Khanh Hien & Duy Nhat Phan & Nicolas Gillis, 2022. "Inertial alternating direction method of multipliers for non-convex non-smooth optimization," Computational Optimization and Applications, Springer, vol. 83(1), pages 247-285, September.
    20. Zhaoyu Xing & Yang Wan & Juan Wen & Wei Zhong, 2024. "GOLFS: feature selection via combining both global and local information for high dimensional clustering," Computational Statistics, Springer, vol. 39(5), pages 2651-2675, July.
    21. Michael Brusco & J Dennis Cradit & Douglas Steinley, 2021. "A comparison of 71 binary similarity coefficients: The effect of base rates," PLOS ONE, Public Library of Science, vol. 16(4), pages 1-19, April.
