
Hybrid clustering based on content and connection structure using joint nonnegative matrix factorization

Author

Listed:
  • Rundong Du

    (Georgia Institute of Technology)

  • Barry Drake

    (Georgia Institute of Technology)

  • Haesun Park

    (Georgia Institute of Technology)

Abstract

We present JointNMF, a hybrid method for discovering latent information in data sets that contain both text content and connection structure information. The new method jointly optimizes an integrated objective function that combines two components: the Nonnegative Matrix Factorization (NMF) objective function for handling text content and the Symmetric NMF (SymNMF) objective function for handling network structure information. An effective algorithm for the joint NMF objective function is proposed so that the efficient block coordinate descent framework can be used. The proposed hybrid method simultaneously discovers content associations and related latent connections without requiring an additional clustering step as postprocessing. It is shown that the proposed method can also be applied when the text content is associated with hypergraph edges. An additional capability of JointNMF is the prediction of unknown network information, which is illustrated using several real-world problems such as citation recommendation for papers and leader detection in organizations. The proposed method can also be applied to general data expressed with both feature space vectors and pairwise similarities, and can be extended to the case of multiple feature spaces or multiple similarity measures. Our experimental results illustrate multiple advantages of the proposed hybrid method when both content and connection structure information are available in the data: higher-quality clustering results and the discovery of new information, such as the prediction of unknown links.
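
The combined objective described in the abstract can be illustrated with a small sketch. The abstract does not give the formula, so the form below is an assumption: a content term ||X - WH||_F^2 (NMF) plus a weighted connection term alpha * ||S - H^T H||_F^2 (SymNMF), with a shared factor H. The weight `alpha`, the matrix shapes, and the function names `joint_nmf_objective` and `cluster_labels` are hypothetical and chosen only for illustration; the sketch evaluates the objective and does not implement the paper's block coordinate descent algorithm.

```python
import numpy as np


def joint_nmf_objective(X, S, W, H, alpha=1.0):
    """Evaluate an assumed joint objective.

    X : (m, n) nonnegative term-document matrix (text content)
    S : (n, n) symmetric nonnegative similarity/adjacency matrix (connections)
    W : (m, k) nonnegative basis matrix
    H : (k, n) nonnegative coefficient matrix shared by both terms
    alpha : assumed weight balancing content against connection structure
    """
    content_term = np.linalg.norm(X - W @ H, "fro") ** 2
    connection_term = np.linalg.norm(S - H.T @ H, "fro") ** 2
    return content_term + alpha * connection_term


def cluster_labels(H):
    """Assign each item (column of H) to the row with the largest coefficient."""
    return np.argmax(H, axis=0)


# Toy data: 4 documents in 2 groups, 3 terms (all values are made up).
H_true = np.array([[1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])
W_true = np.array([[2.0, 0.0],
                   [1.0, 0.0],
                   [0.0, 3.0]])
X = W_true @ H_true      # content matrix reproduced exactly by W_true, H_true
S = H_true.T @ H_true    # connection matrix reproduced exactly by H_true

exact = joint_nmf_objective(X, S, W_true, H_true)
labels = cluster_labels(H_true)
```

Because H appears in both terms, cluster memberships can be read directly from its columns, which is one way to interpret the abstract's remark that no separate clustering step is needed.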

Suggested Citation

  • Rundong Du & Barry Drake & Haesun Park, 2019. "Hybrid clustering based on content and connection structure using joint nonnegative matrix factorization," Journal of Global Optimization, Springer, vol. 74(4), pages 861-877, August.
  • Handle: RePEc:spr:jglopt:v:74:y:2019:i:4:d:10.1007_s10898-017-0578-x
    DOI: 10.1007/s10898-017-0578-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10898-017-0578-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10898-017-0578-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Jingu Kim & Yunlong He & Haesun Park, 2014. "Algorithms for nonnegative matrix and tensor factorizations: a unified view based on block coordinate descent framework," Journal of Global Optimization, Springer, vol. 58(2), pages 285-319, February.
    2. Da Kuang & Sangwoon Yun & Haesun Park, 2015. "SymNMF: nonnegative low-rank approximation of a similarity matrix for graph clustering," Journal of Global Optimization, Springer, vol. 62(3), pages 545-574, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rundong Du & Da Kuang & Barry Drake & Haesun Park, 2017. "DC-NMF: nonnegative matrix factorization based on divide-and-conquer for fast clustering and topic modeling," Journal of Global Optimization, Springer, vol. 68(4), pages 777-798, August.
    2. Srinivas Eswar & Ramakrishnan Kannan & Richard Vuduc & Haesun Park, 2021. "ORCA: Outlier detection and Robust Clustering for Attributed graphs," Journal of Global Optimization, Springer, vol. 81(4), pages 967-989, December.
    3. Gillis, Nicolas & Glineur, François & Tuyttens, Daniel & Vandaele, Arnaud, 2015. "Heuristics for exact nonnegative matrix factorization," LIDAM Discussion Papers CORE 2015006, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    4. Arnaud Vandaele & François Glineur & Nicolas Gillis, 2018. "Algorithms for positive semidefinite factorization," Computational Optimization and Applications, Springer, vol. 71(1), pages 193-219, September.
    5. Takehiro Sano & Tsuyoshi Migita & Norikazu Takahashi, 2022. "A novel update rule of HALS algorithm for nonnegative matrix factorization and Zangwill’s global convergence," Journal of Global Optimization, Springer, vol. 84(3), pages 755-781, November.
    6. April R. Kriebel & Joshua D. Welch, 2022. "UINMF performs mosaic integration of single-cell multi-omic datasets using nonnegative matrix factorization," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    7. Andrej Čopar & Blaž Zupan & Marinka Zitnik, 2019. "Fast optimization of non-negative matrix tri-factorization," PLOS ONE, Public Library of Science, vol. 14(6), pages 1-15, June.
    8. Duy Khuong Nguyen & Tu Bao Ho, 2017. "Accelerated parallel and distributed algorithm using limited internal memory for nonnegative matrix factorization," Journal of Global Optimization, Springer, vol. 68(2), pages 307-328, June.
    9. Flavia Esposito, 2021. "A Review on Initialization Methods for Nonnegative Matrix Factorization: Towards Omics Data Experiments," Mathematics, MDPI, vol. 9(9), pages 1-17, April.
    10. Da Kuang & Sangwoon Yun & Haesun Park, 2015. "SymNMF: nonnegative low-rank approximation of a similarity matrix for graph clustering," Journal of Global Optimization, Springer, vol. 62(3), pages 545-574, July.
    11. Yang Qi, 2018. "A Very Brief Introduction to Nonnegative Tensors from the Geometric Viewpoint," Mathematics, MDPI, vol. 6(11), pages 1-19, October.
    12. Radu-Alexandru Dragomir & Alexandre d’Aspremont & Jérôme Bolte, 2021. "Quartic First-Order Methods for Low-Rank Minimization," Journal of Optimization Theory and Applications, Springer, vol. 189(2), pages 341-363, May.
    13. He, Chaobo & Zhang, Qiong & Tang, Yong & Liu, Shuangyin & Zheng, Jianhua, 2019. "Community detection method based on robust semi-supervised nonnegative matrix factorization," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 523(C), pages 279-291.
    14. Saeedmanesh, Mohammadreza & Geroliminis, Nikolas, 2016. "Clustering of heterogeneous networks with directional flows based on “Snake” similarities," Transportation Research Part B: Methodological, Elsevier, vol. 91(C), pages 250-269.
    15. Norikazu Takahashi & Jiro Katayama & Masato Seki & Jun’ichi Takeuchi, 2018. "A unified global convergence analysis of multiplicative update rules for nonnegative matrix factorization," Computational Optimization and Applications, Springer, vol. 71(1), pages 221-250, September.
    16. Johannes Friedrich & Weijian Yang & Daniel Soudry & Yu Mu & Misha B Ahrens & Rafael Yuste & Darcy S Peterka & Liam Paninski, 2017. "Multi-scale approaches for high-speed imaging and analysis of large neural populations," PLOS Computational Biology, Public Library of Science, vol. 13(8), pages 1-24, August.
