Printed from https://ideas.repec.org/a/plo/pone00/0203247.html

Determination of essential phenotypic elements of clusters in high-dimensional entities—DEPECHE

Authors
  • Axel Theorell
  • Yenan Troi Bryceson
  • Jakob Theorell

Abstract

Technological advances have facilitated an exponential increase in the amount of information that can be derived from single cells, necessitating new computational tools that can make such highly complex data interpretable. Here, we introduce DEPECHE, a rapid, parameter-free, sparse k-means-based algorithm for clustering of multi- and megavariate single-cell data. In a number of computational benchmarks aimed at evaluating the capacity to form biologically relevant clusters, including flow/mass-cytometry and single-cell RNA sequencing data sets with manually curated gold standard solutions, DEPECHE clusters as well as or better than the best-performing clustering algorithms currently available. However, the main advantage of DEPECHE, compared to the state of the art, is its unique ability to enhance the interpretability of the formed clusters: it retains only the variables relevant for cluster separation, thereby facilitating computationally efficient analyses as well as understanding of complex datasets. DEPECHE is implemented in the open-source R package DepecheR, currently available at github.com/Theorell/DepecheR.
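
The abstract describes DEPECHE as a sparse k-means-based algorithm that keeps only the variables relevant for cluster separation. As a rough illustration of that general idea, and not of DepecheR's actual implementation or API, the sketch below runs k-means with an L1 penalty on the cluster centers: each center coordinate is soft-thresholded toward zero, and a variable whose coordinate is zero in every center no longer helps separate the clusters. The function name `sparse_kmeans`, its parameters (`penalty`, `n_warmup`), the unpenalized warm-up iterations, and the synthetic data are all assumptions made for this example. Unlike DEPECHE, which the abstract describes as parameter-free, this sketch needs a hand-picked penalty.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink each entry of x toward zero by t; entries within [-t, t] become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_kmeans(X, k, penalty, n_iter=30, n_warmup=5, seed=0):
    """Illustrative L1-penalized k-means (hypothetical helper, not DepecheR).

    Minimizes sum_i ||x_i - mu_c(i)||^2 + penalty * sum_j ||mu_j||_1 by
    alternating assignments and center updates. Assumes the columns of X are
    centered, so that a zero center coordinate means "no different from the
    overall mean".
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct observations.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()

    def assign(c):
        # Squared Euclidean distance from every observation to every center.
        d = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for it in range(n_iter):
        labels = assign(centers)
        # Assumption of this sketch: a few plain k-means iterations first,
        # so the penalty is applied to reasonably sized clusters.
        lam = 0.0 if it < n_warmup else penalty
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # keep the previous center for an empty cluster
            # Closed-form minimizer of n_j*(mu - mean)^2 + lam*|mu| per coordinate.
            centers[j] = soft_threshold(members.mean(axis=0),
                                        lam / (2 * len(members)))

    labels = assign(centers)
    # A variable is retained if at least one center is nonzero in it.
    selected = np.any(centers != 0.0, axis=0)
    return centers, labels, selected


# Synthetic example: variables 0 and 1 separate two groups, 2 to 4 are noise.
rng = np.random.default_rng(1)
group_a = rng.normal(size=(200, 5))
group_a[:, :2] += 4.0
group_b = rng.normal(size=(200, 5))
group_b[:, :2] -= 4.0
X = np.vstack([group_a, group_b])
X -= X.mean(axis=0)  # center the columns

centers, labels, selected = sparse_kmeans(X, k=2, penalty=200.0)
print(selected)
```

With a penalty of this size, the noise variables are expected to be zeroed out of both centers while the two informative variables survive, which is the sense in which a sparse center matrix makes the clusters easier to read.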

Suggested Citation

  • Axel Theorell & Yenan Troi Bryceson & Jakob Theorell, 2019. "Determination of essential phenotypic elements of clusters in high-dimensional entities—DEPECHE," PLOS ONE, Public Library of Science, vol. 14(3), pages 1-15, March.
  • Handle: RePEc:plo:pone00:0203247
    DOI: 10.1371/journal.pone.0203247

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0203247
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0203247&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Amos Tanay & Aviv Regev, 2017. "Scaling single-cell genomics from phenomenology to mechanism," Nature, Nature, vol. 541(7637), pages 331-338, January.
    2. Witten, Daniela M. & Tibshirani, Robert, 2010. "A Framework for Feature Selection in Clustering," Journal of the American Statistical Association, American Statistical Association, vol. 105(490), pages 713-726.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yaeji Lim & Hee-Seok Oh & Ying Kuen Cheung, 2019. "Multiscale Clustering for Functional Data," Journal of Classification, Springer;The Classification Society, vol. 36(2), pages 368-391, July.
    2. Yujia Li & Xiangrui Zeng & Chien‐Wei Lin & George C. Tseng, 2022. "Simultaneous estimation of cluster number and feature sparsity in high‐dimensional cluster analysis," Biometrics, The International Biometric Society, vol. 78(2), pages 574-585, June.
    3. Dong Liu & Changwei Zhao & Yong He & Lei Liu & Ying Guo & Xinsheng Zhang, 2023. "Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix‐variate fMRI data," Biometrics, The International Biometric Society, vol. 79(3), pages 2246-2259, September.
    4. Jeffrey Andrews & Paul McNicholas, 2014. "Variable Selection for Clustering and Classification," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 136-153, July.
    5. Samuel S. Kim & Buu Truong & Karthik Jagadeesh & Kushal K. Dey & Amber Z. Shen & Soumya Raychaudhuri & Manolis Kellis & Alkes L. Price, 2024. "Leveraging single-cell ATAC-seq and RNA-seq to identify disease-critical fetal and adult brain cell types," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    6. Kai Deng & Xin Zhang, 2022. "Tensor envelope mixture model for simultaneous clustering and multiway dimension reduction," Biometrics, The International Biometric Society, vol. 78(3), pages 1067-1079, September.
    7. J. Fernando Vera & Rodrigo Macías, 2021. "On the Behaviour of K-Means Clustering of a Dissimilarity Matrix by Means of Full Multidimensional Scaling," Psychometrika, Springer;The Psychometric Society, vol. 86(2), pages 489-513, June.
    8. Peter Radchenko & Gourab Mukherjee, 2017. "Convex clustering via l1 fusion penalization," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1527-1546, November.
    9. Zhiguang Huo & Li Zhu & Tianzhou Ma & Hongcheng Liu & Song Han & Daiqing Liao & Jinying Zhao & George Tseng, 2020. "Two-Way Horizontal and Vertical Omics Integration for Disease Subtype Discovery," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(1), pages 1-22, April.
    10. Baolin Wu, 2013. "Sparse cluster analysis of large-scale discrete variables with application to single nucleotide polymorphism data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 40(2), pages 358-367, February.
    11. Peña, Daniel & Prieto Fernández, Francisco Javier & Rendon Aguirre, Janeth Carolina, 2017. "Clustering Big Data by Extreme Kurtosis Projections," DES - Working Papers. Statistics and Econometrics. WS 24522, Universidad Carlos III de Madrid. Departamento de Estadística.
    12. Charles Bouveyron & Camille Brunet-Saumard, 2014. "Discriminative variable selection for clustering with the sparse Fisher-EM algorithm," Computational Statistics, Springer, vol. 29(3), pages 489-513, June.
    13. Floriello, Davide & Vitelli, Valeria, 2017. "Sparse clustering of functional data," Journal of Multivariate Analysis, Elsevier, vol. 154(C), pages 1-18.
    14. Corinna Kleinert & Alexander Vosseler & Uwe Blien, 2018. "Classifying vocational training markets," The Annals of Regional Science, Springer;Western Regional Science Association, vol. 61(1), pages 31-48, July.
    15. Hosik Choi & Seokho Lee, 2019. "Convex clustering for binary data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(4), pages 991-1018, December.
    16. Maarten M. Kampert & Jacqueline J. Meulman & Jerome H. Friedman, 2017. "rCOSA: A Software Package for Clustering Objects on Subsets of Attributes," Journal of Classification, Springer;The Classification Society, vol. 34(3), pages 514-547, October.
    17. Pi, J. & Wang, Honggang & Pardalos, Panos M., 2021. "A dual reformulation and solution framework for regularized convex clustering problems," European Journal of Operational Research, Elsevier, vol. 290(3), pages 844-856.
    18. Calin-Adrian Comes & Elena Bunduchi & Valentina Vasile & Daniel Stefan, 2018. "The Impact of Foreign Direct Investments and Remittances on Economic Growth: A Case Study in Central and Eastern Europe," Sustainability, MDPI, vol. 10(1), pages 1-16, January.
    19. Gaynor, Sheila & Bair, Eric, 2017. "Identification of relevant subtypes via preweighted sparse clustering," Computational Statistics & Data Analysis, Elsevier, vol. 116(C), pages 139-154.
    20. Zengchao Xu & Shan Luo & Zehua Chen, 2023. "A Portmanteau Local Feature Discrimination Approach to the Classification with High-dimensional Matrix-variate Data," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 85(1), pages 441-467, February.
