
Well-connectedness and community detection

Author

Listed:
  • Minhyuk Park
  • Yasamin Tabatabaee
  • Vikram Ramavarapu
  • Baqiao Liu
  • Vidya Kamath Pailodi
  • Rajiv Ramachandran
  • Dmitriy Korobskiy
  • Fabio Ayres
  • George Chacko
  • Tandy Warnow

Abstract

Community detection methods help reveal the meso-scale structure of complex networks. Integral to detecting communities is the expectation that communities in a network are edge-dense and “well-connected”. Surprisingly, we find that five different community detection methods identify communities that fail even a mild requirement for well-connectedness: the Leiden algorithm optimizing the Constant Potts Model, the Leiden algorithm optimizing modularity, Infomap, Markov Cluster (MCL), and Iterative k-core (IKC). To address this issue, we have developed the Connectivity Modifier (CM), which iteratively removes small edge cuts and re-clusters until communities are well-connected according to a user-specified criterion. We tested CM on real-world networks ranging in size from approximately 35,000 to 75,000,000 nodes. Post-processing of the output of community detection methods by CM resulted in a reduction in node coverage. Results on synthetic networks show that the CM algorithm generally maintains or improves accuracy in recovering true communities. This study underscores the importance of network clusterability (the fraction of a network that exhibits community structure) and the need for more models of community structure where networks contain nodes that are not assigned to communities. In summary, we address well-connectedness as an important aspect of clustering and present a scalable open-source tool for well-connected clusters.

Author summary: Community detection, a term used interchangeably with clustering, is used in network analysis. An expectation is that communities or clusters should be dense and well-connected. However, density is separable from well-connectedness, as clusters may be dense without being well-connected. Our study demonstrates that several clustering algorithms generate clusters that are not well-connected according to a mild standard we impose. To address this issue, we developed the Connectivity Modifier (CM), a tool to allow users to specify a threshold for well-connectedness and enforce it in the output of multiple community detection methods.
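
The loop the abstract describes for CM (find a small edge cut, remove it, re-cluster, repeat until every community meets the criterion) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: the criterion used here (minimum cut size greater than log10 of the cluster size), the treatment of singletons as unclustered nodes, and the use of connected components as a stand-in for re-running the original clustering method. All function names are hypothetical.

```python
import math

def min_cut(nodes, adj):
    """Stoer-Wagner global minimum edge cut of the subgraph induced by `nodes`.
    Returns (cut_size, one_side). Requires at least two nodes."""
    nodes = set(nodes)
    # w[u][v] = number of edges between (possibly merged) super-nodes u and v
    w = {u: {v: 1 for v in adj[u] if v in nodes} for u in nodes}
    members = {u: {u} for u in nodes}
    best_size, best_side = math.inf, set()
    while len(w) > 1:
        start = next(iter(w))
        added = [start]
        # conn[v] = total edge weight between v and the nodes added so far
        conn = {v: w[start].get(v, 0) for v in w if v != start}
        while conn:
            nxt = max(conn, key=conn.get)
            cut_of_phase = conn.pop(nxt)
            added.append(nxt)
            for v, wt in w[nxt].items():
                if v in conn:
                    conn[v] += wt
        s, t = added[-2], added[-1]
        if cut_of_phase < best_size:
            best_size, best_side = cut_of_phase, set(members[t])
        # Merge t into s before the next phase.
        members[s] |= members.pop(t)
        for v, wt in w.pop(t).items():
            del w[v][t]
            if v != s:
                w[s][v] = w[s].get(v, 0) + wt
                w[v][s] = w[s][v]
    return best_size, best_side

def components(nodes, adj):
    """Connected components of the subgraph induced by `nodes`."""
    nodes, comps = set(nodes), []
    while nodes:
        comp, frontier = set(), [nodes.pop()]
        while frontier:
            u = frontier.pop()
            comp.add(u)
            for v in adj[u]:
                if v in nodes:
                    nodes.remove(v)
                    frontier.append(v)
        comps.append(comp)
    return comps

def connectivity_modifier(clusters, adj, threshold=math.log10, recluster=components):
    """Split clusters along small cuts until each passes the criterion.
    ASSUMPTIONS: log10 threshold, singletons dropped, `recluster` is a stand-in."""
    kept, stack = [], [set(c) for c in clusters]
    while stack:
        c = stack.pop()
        if len(c) < 2:
            continue  # singleton: the node is left unclustered
        size, side = min_cut(c, adj)
        if size > threshold(len(c)):
            kept.append(c)  # well-connected under the chosen criterion
            continue
        # Removing the cut edges separates `side` from the rest; re-cluster both.
        stack.extend(recluster(side, adj))
        stack.extend(recluster(c - side, adj))
    return kept

# Toy graph: two 6-cliques joined by a single bridge edge (5, 6),
# plus nodes 12 and 13, which are not adjacent to each other.
edges = [(i, j) for base in (0, 6) for i in range(base, base + 6)
         for j in range(i + 1, base + 6)] + [(5, 6), (0, 12), (6, 13)]
adj = {n: set() for n in range(14)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

# Hypothetical input clustering: one cluster spanning both cliques,
# and one disconnected cluster {12, 13}.
result = connectivity_modifier([set(range(12)), {12, 13}], adj)
coverage = sum(len(c) for c in result) / len(adj)
print(sorted(sorted(c) for c in result), round(coverage, 3))
```

In this toy input the 12-node cluster is split at the bridge edge into the two cliques, while the disconnected pair {12, 13} dissolves into unclustered singletons, so node coverage falls from 14/14 to 12/14. This mirrors, on a very small scale, the reduction in node coverage reported in the abstract.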

Suggested Citation

  • Minhyuk Park & Yasamin Tabatabaee & Vikram Ramavarapu & Baqiao Liu & Vidya Kamath Pailodi & Rajiv Ramachandran & Dmitriy Korobskiy & Fabio Ayres & George Chacko & Tandy Warnow, 2024. "Well-connectedness and community detection," PLOS Complex Systems, Public Library of Science, vol. 1(3), pages 1-25, November.
  • Handle: RePEc:plo:pcsy00:0000009
    DOI: 10.1371/journal.pcsy.0000009

    Download full text from publisher

    File URL: https://journals.plos.org/complexsystems/article?id=10.1371/journal.pcsy.0000009
    Download Restriction: no

    File URL: https://journals.plos.org/complexsystems/article/file?id=10.1371/journal.pcsy.0000009&type=printable
    Download Restriction: no



