
Directionally dependent multi-view clustering using copula model

Author

Listed:
  • Kahkashan Afrin
  • Ashif S Iquebal
  • Mostafa Karimi
  • Allyson Souris
  • Se Yoon Lee
  • Bani K Mallick

Abstract

Recent developments in high-throughput methods have resulted in the collection of high-dimensional data types from multiple sources and technologies that measure distinct yet complementary information. Integrated clustering of such multiple data types, or multi-view clustering, is critical for revealing pathological insights. However, multi-view clustering is challenging due to the complex dependence structure between multiple data types, including directional dependency. Specifically, genomics data types have pre-specified directional dependencies known as the central dogma, which describes the flow of information from DNA to messenger RNA (mRNA) and then from mRNA to protein. Most existing multi-view clustering approaches assume an independent structure or pairwise (non-directional) dependence between data types, thereby ignoring their directional relationship. Motivated by this, we propose a biology-inspired Bayesian integrated multi-view clustering model that uses an asymmetric copula to accommodate the directional dependencies between the data types. Via extensive simulation experiments, we demonstrate the negative impact of ignoring directional dependency on clustering performance. We also present an application of our model to a real-world dataset of breast cancer tumor samples collected from The Cancer Genome Atlas program and provide comparative results.
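
As a rough illustration of what "asymmetric" means for a copula, the sketch below builds a non-exchangeable copula from a symmetric one and measures how far C(u, v) departs from C(v, u). It is not the model proposed in the article: the abstract does not say which asymmetric copula is used, so the Clayton base copula, the Khoudraji-style construction, and the parameter values (theta, a, b) are all assumptions chosen only for illustration.

```python
def clayton(u, v, theta=2.0):
    """Symmetric Clayton copula C(u, v) for theta > 0 and u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def khoudraji(u, v, a, b, theta=2.0):
    """Khoudraji-style construction u^(1-a) * v^(1-b) * C(u^a, v^b).

    With a != b the result is generally not exchangeable, i.e. it treats
    its two arguments differently. The parameters a, b in [0, 1] are
    illustrative values, not taken from the article.
    """
    return u ** (1.0 - a) * v ** (1.0 - b) * clayton(u ** a, v ** b, theta)


def asymmetry(copula, grid=50):
    """Largest |C(u, v) - C(v, u)| over an interior grid of (0, 1)^2."""
    pts = [(i + 0.5) / grid for i in range(grid)]
    return max(abs(copula(u, v) - copula(v, u)) for u in pts for v in pts)


# The symmetric copula gives a gap of zero; the asymmetric one does not.
sym_gap = asymmetry(lambda u, v: clayton(u, v))
asym_gap = asymmetry(lambda u, v: khoudraji(u, v, a=0.9, b=0.3))
print(f"symmetric gap:  {sym_gap:.6f}")
print(f"asymmetric gap: {asym_gap:.6f}")
```

A symmetric copula cannot distinguish "view 1 drives view 2" from the reverse, which is why a model that wants to encode a direction such as DNA to mRNA to protein needs a dependence function whose arguments are not interchangeable.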

Suggested Citation

  • Kahkashan Afrin & Ashif S Iquebal & Mostafa Karimi & Allyson Souris & Se Yoon Lee & Bani K Mallick, 2020. "Directionally dependent multi-view clustering using copula model," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-18, October.
  • Handle: RePEc:plo:pone00:0238996
    DOI: 10.1371/journal.pone.0238996

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0238996
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0238996&type=printable
    Download Restriction: no



