IDEAS home Printed from https://ideas.repec.org/a/plo/pcsy00/0000023.html
   My bibliography  Save this article

Vertex clustering in diverse dynamic networks

Author

Listed:
  • Devavrat Vivek Dabke
  • Olga Dorabiala

Abstract

We present theoretical and experimental results for spatiotemporal graph k-means (STGkM)—a new unsupervised method to cluster vertices within a dynamic network. STGkM finds both short-term dynamic clusters and a “long-lived” partition of vertices within a network whose topology is evolving over time; we first introduced this technique in a recent conference paper. Here, we update our algorithm with a more efficient relaxation scheme, provide additional theoretical results, compare its performance to several other methods, and demonstrate its capabilities on real, diverse datasets. We construct a theoretical foundation to distinguish STGkM from connected components and static clustering and prove results for the stochastic setting for the first time. In addition to our previous experiments on the United States House of Representatives dataset, we report new state-of-the-art empirical results on a dynamic scientific citation network and Reddit dataset. These findings demonstrate that STGkM is accurate, efficient, informative, and operates well in diverse settings. Finally, as previously noted, one of the main advantages of STGkM is that it has only one required parameter: k, the number of clusters; we therefore include an extended analysis of the range of this parameter and guidance on selecting its optimal value. Our data and code are available on Github; see: https://github.com/dynestic/stgkm.Author summary: With the explosion of data about the world around us, we must constantly develop new ways of studying datasets. One popular method for analysis is k-means, which can identify clusters of related objects based on shared characteristics. Traditionally, k-means worked over sets of objects (e.g. animal subjects in a biology study) for which it was possible to define a list of consistent, numerical, and complete “features” or pieces of information (like age or weight). Over the years, this algorithm has been adapted for a variety of datasets and implemented in efficient code libraries. Our paper builds on this vast literature to extend k-means to the setting of networks that change over time, and we provide a practical and efficient implementation of it for real-world usage. We show that our new algorithm works on datasets like the United States House of Representatives roll call votes, citation networks from major publications, and a sample of Reddit posts. We also provide formal mathematical proofs and demonstrate the theoretical soundness of our technique.

Suggested Citation

  • Devavrat Vivek Dabke & Olga Dorabiala, 2024. "Vertex clustering in diverse dynamic networks," PLOS Complex Systems, Public Library of Science, vol. 1(4), pages 1-29, December.
  • Handle: RePEc:plo:pcsy00:0000023
    DOI: 10.1371/journal.pcsy.0000023
    as

    Download full text from publisher

    File URL: https://journals.plos.org/complexsystems/article?id=10.1371/journal.pcsy.0000023
    Download Restriction: no

    File URL: https://journals.plos.org/complexsystems/article/file?id=10.1371/journal.pcsy.0000023&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcsy.0000023?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pcsy00:0000023. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: complexsystem (email available below). General contact details of provider: https://journals.plos.org/complexsystems/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.