Printed from https://ideas.repec.org/a/spr/stabio/v15y2023i3d10.1007_s12561-022-09350-w.html

A Unified Bayesian Framework for Bi-overlapping-Clustering Multi-omics Data via Sparse Matrix Factorization

Author

Listed:
  • Fangting Zhou

    (Renmin University of China
    Texas A&M University)

  • Kejun He

    (Renmin University of China)

  • James J. Cai

    (Texas A&M University)

  • Laurie A. Davidson

    (Texas A&M University)

  • Robert S. Chapkin

    (Texas A&M University)

  • Yang Ni

    (Texas A&M University)

Abstract

Advances in modern sequencing techniques have generated an unprecedented amount of multi-omics data, which provide great opportunities to quantitatively explore functional genomes from different but complementary perspectives. However, distinct modalities and sequencing technologies generate diverse types of data, which greatly complicates statistical modeling because each data type requires its own uniquely optimized methods. In this paper, we propose a unified framework for Bayesian nonparametric matrix factorization that infers overlapping bi-clusters for multi-omics data. The proposed method adaptively discretizes different types of observations into common latent states, on which cluster structures are built hierarchically. Because the method is Bayesian nonparametric, it automatically determines the number of clusters. We demonstrate the utility of the proposed method using simulation studies and applications to a single-cell RNA-sequencing dataset, a combined single-cell RNA-sequencing and single-cell ATAC-sequencing dataset, a bulk RNA-sequencing dataset, and a DNA methylation dataset; the results reveal several interesting findings that are consistent with the biological literature.
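
To make the two ideas in the abstract concrete (mapping observations to common latent states, and reading overlapping bi-clusters off sparse binary factor matrices), here is a minimal toy sketch. It is not the authors' model: the fixed thresholds `low` and `high`, the three-state coding (-1, 0, +1), the function names, and the hand-written membership matrices are all assumptions made for illustration. The paper's method discretizes adaptively and infers the memberships and the number of clusters from the data.

```python
from typing import List, Set, Tuple

Bicluster = Tuple[Set[int], Set[int]]  # (row indices, column indices)


def discretize(values: List[List[float]], low: float, high: float) -> List[List[int]]:
    """Map each observation to a latent state: -1 (low), 0 (baseline), +1 (high).

    Assumption: fixed, user-chosen thresholds stand in for the paper's
    adaptive discretization.
    """
    return [[-1 if v < low else (1 if v > high else 0) for v in row] for row in values]


def biclusters(row_member: List[List[int]], col_member: List[List[int]]) -> List[Bicluster]:
    """Read bi-clusters off two sparse binary membership matrices.

    row_member[i][k] == 1 means row i belongs to cluster k;
    col_member[j][k] == 1 means column j belongs to cluster k.
    A row or column may belong to several clusters, which is what
    makes the bi-clusters overlap.
    """
    n_clusters = len(row_member[0])
    result = []
    for k in range(n_clusters):
        rows = {i for i, r in enumerate(row_member) if r[k] == 1}
        cols = {j for j, c in enumerate(col_member) if c[k] == 1}
        result.append((rows, cols))
    return result


def overlap(bics: List[Bicluster]) -> Set[Tuple[int, int]]:
    """Return the matrix cells covered by more than one bi-cluster."""
    seen: Set[Tuple[int, int]] = set()
    shared: Set[Tuple[int, int]] = set()
    for rows, cols in bics:
        for i in rows:
            for j in cols:
                if (i, j) in seen:
                    shared.add((i, j))
                seen.add((i, j))
    return shared


# Hypothetical memberships: 4 rows (e.g. genes), 3 columns (e.g. cells), 2 clusters.
row_member = [[1, 0], [1, 1], [0, 1], [0, 0]]
col_member = [[1, 0], [1, 1], [0, 1]]

bics = biclusters(row_member, col_member)
print(bics)
print(sorted(overlap(bics)))
print(discretize([[0.1, 2.5], [-3.0, 0.0]], low=-1.0, high=1.0))
```

In this toy example, row 1 and column 1 each belong to both clusters, so the cell (1, 1) is shared by the two bi-clusters. Row 3 belongs to none, which is permitted when the membership matrices are sparse.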

Suggested Citation

  • Fangting Zhou & Kejun He & James J. Cai & Laurie A. Davidson & Robert S. Chapkin & Yang Ni, 2023. "A Unified Bayesian Framework for Bi-overlapping-Clustering Multi-omics Data via Sparse Matrix Factorization," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 15(3), pages 669-691, December.
  • Handle: RePEc:spr:stabio:v:15:y:2023:i:3:d:10.1007_s12561-022-09350-w
    DOI: 10.1007/s12561-022-09350-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s12561-022-09350-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s12561-022-09350-w?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wentao Qu & Xianchao Xiu & Huangyue Chen & Lingchen Kong, 2023. "A Survey on High-Dimensional Subspace Clustering," Mathematics, MDPI, vol. 11(2), pages 1-39, January.
    2. Rafael Teixeira & Mário Antunes & Diogo Gomes & Rui L. Aguiar, 2024. "Comparison of Semantic Similarity Models on Constrained Scenarios," Information Systems Frontiers, Springer, vol. 26(4), pages 1307-1330, August.
    3. Del Corso, Gianna M. & Romani, Francesco, 2019. "Adaptive nonnegative matrix factorization and measure comparisons for recommender systems," Applied Mathematics and Computation, Elsevier, vol. 354(C), pages 164-179.
    4. P Fogel & C Geissler & P Cotte & G Luta, 2022. "Applying separative non-negative matrix factorization to extra-financial data," Working Papers hal-03689774, HAL.
    5. repec:ers:journl:v:xxiv:y:2021:i:4b:p:659-667 is not listed on IDEAS
    6. Kim, Junyung & Shah, Asad Ullah Amin & Kang, Hyun Gook, 2020. "Dynamic risk assessment with bayesian network and clustering analysis," Reliability Engineering and System Safety, Elsevier, vol. 201(C).
    7. Thanh Loc Nguyen & Youngjin Choi & Jihye Im & Hyunsu Shin & Ngoc Man Phan & Min Kyung Kim & Seung Woo Choi & Jaeyun Kim, 2022. "Immunosuppressive biomaterial-based therapeutic vaccine to treat multiple sclerosis via re-establishing immune tolerance," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    8. Spelta, A. & Pecora, N. & Rovira Kaltwasser, P., 2019. "Identifying Systemically Important Banks: A temporal approach for macroprudential policies," Journal of Policy Modeling, Elsevier, vol. 41(1), pages 197-218.
    9. Wei Jin & Jingchun Ma & Li Rong & Shengshuo Huang & Tuo Li & Guoxiang Jin & Zhongjun Zhou, 2025. "Semi-automated IT-scATAC-seq profiles cell-specific chromatin accessibility in differentiation and peripheral blood populations," Nature Communications, Nature, vol. 16(1), pages 1-13, December.
    10. David G Mets & Michael S Brainard, 2018. "An automated approach to the quantitation of vocalizations and vocal learning in the songbird," PLOS Computational Biology, Public Library of Science, vol. 14(8), pages 1-29, August.
    11. Paul Fogel & Yann Gaston-Mathé & Douglas Hawkins & Fajwel Fogel & George Luta & S. Stanley Young, 2016. "Applications of a Novel Clustering Approach Using Non-Negative Matrix Factorization to Environmental Research in Public Health," IJERPH, MDPI, vol. 13(5), pages 1-14, May.
    12. Le Thi Khanh Hien & Duy Nhat Phan & Nicolas Gillis, 2022. "Inertial alternating direction method of multipliers for non-convex non-smooth optimization," Computational Optimization and Applications, Springer, vol. 83(1), pages 247-285, September.
    13. Noah E. Friedkin, 1984. "Structural Cohesion and Equivalence Explanations of Social Homogeneity," Sociological Methods & Research, vol. 12(3), pages 235-261, February.
    14. David Matesanz Gomez & Guillermo J. Ortega & Benno Torgler, 2011. "Measuring globalization: A hierarchical network approach," CREMA Working Paper Series 2011-11, Center for Research in Economics, Management and the Arts (CREMA).
    15. Balepur, Prashant Narayan, 1998. "Impacts of Computer-Mediated Communication on Travel and Communication Patterns: The Davis Community Network Study," Institute of Transportation Studies, Research Reports, Working Papers, Proceedings qt6cb1f85c, Institute of Transportation Studies, UC Berkeley.
    16. Lisa Price, 2001. "Demystifying farmers' entomological and pest management knowledge: A methodology for assessing the impacts on knowledge from IPM-FFS and NES interventions," Agriculture and Human Values, Springer;The Agriculture, Food, & Human Values Society (AFHVS), vol. 18(2), pages 153-176, June.
    17. Elisa Frutos-Bernal & Ángel Martín del Rey & Irene Mariñas-Collado & María Teresa Santos-Martín, 2022. "An Analysis of Travel Patterns in Barcelona Metro Using Tucker3 Decomposition," Mathematics, MDPI, vol. 10(7), pages 1-17, March.
    18. Jingfeng Guo & Chao Zheng & Shanshan Li & Yutong Jia & Bin Liu, 2022. "BiInfGCN: Bilateral Information Augmentation of Graph Convolutional Networks for Recommendation," Mathematics, MDPI, vol. 10(17), pages 1-16, August.
    19. Geert Soete & Wayne DeSarbo & J. Carroll, 1985. "Optimal variable weighting for hierarchical clustering: An alternating least-squares algorithm," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 173-192, December.
    20. Teh, Boon Kin & Goo, Yik Wen & Lian, Tong Wei & Ong, Wei Guang & Choi, Wen Ting & Damodaran, Mridula & Cheong, Siew Ann, 2015. "The Chinese Correction of February 2007: How financial hierarchies change in a market crash," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 424(C), pages 225-241.
    21. Jianfei Cao & Han Yang & Jianshu Lv & Quanyuan Wu & Baolei Zhang, 2023. "Estimating Soil Salinity with Different Levels of Vegetation Cover by Using Hyperspectral and Non-Negative Matrix Factorization Algorithm," IJERPH, MDPI, vol. 20(4), pages 1-15, February.
