
Well-connectedness and community detection

Author

Listed:
  • Minhyuk Park
  • Yasamin Tabatabaee
  • Vikram Ramavarapu
  • Baqiao Liu
  • Vidya Kamath Pailodi
  • Rajiv Ramachandran
  • Dmitriy Korobskiy
  • Fabio Ayres
  • George Chacko
  • Tandy Warnow

Abstract

Community detection methods help reveal the meso-scale structure of complex networks. Integral to detecting communities is the expectation that communities in a network are edge-dense and “well-connected”. Surprisingly, we find that five different community detection methods–the Leiden algorithm optimizing the Constant Potts Model, the Leiden algorithm optimizing modularity, Infomap, Markov Cluster (MCL), and Iterative k-core (IKC)–identify communities that fail even a mild requirement for well-connectedness. To address this issue, we have developed the Connectivity Modifier (CM), which iteratively removes small edge cuts and re-clusters until communities are well-connected according to a user-specified criterion. We tested CM on real-world networks ranging in size from approximately 35,000 to 75,000,000 nodes. Post-processing of the output of community detection methods by CM resulted in a reduction in node coverage. Results on synthetic networks show that the CM algorithm generally maintains or improves accuracy in recovering true communities. This study underscores the importance of network clusterability–the fraction of a network that exhibits community structure–and the need for more models of community structure where networks contain nodes that are not assigned to communities. In summary, we address well-connectedness as an important aspect of clustering and present a scalable open-source tool for well-connected clusters.

Author summary: Community detection, a term interchangeably used with clustering, is used in network analysis. An expectation is that communities or clusters should be dense and well-connected. However, density is separable from well-connectedness, as clusters may be dense without being well-connected. Our study demonstrates that several clustering algorithms generate clusters that are not well-connected according to a mild standard we impose. To address this issue, we developed the Connectivity Modifier (CM), a tool to allow users to specify a threshold for well-connectedness and enforce it in the output of multiple community detection methods.
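
The loop described in the abstract (find a small edge cut, remove it, repeat until every cluster meets the criterion) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `min_cut`, `connectivity_modifier`, and `node_coverage` are hypothetical, the default threshold of log10(n) is only an assumed example of a "user-specified criterion", and the re-clustering step is replaced by a stand-in that simply treats the two sides of the cut as new candidate clusters. The minimum cut is computed with the Stoer-Wagner algorithm, which is one possible choice among several.

```python
import math
from itertools import combinations


def min_cut(nodes, edges):
    """Stoer-Wagner global minimum cut of an unweighted graph without self-loops.

    Returns (cut_size, nodes_on_one_side). Assumes at least two nodes.
    """
    w = {u: {} for u in nodes}            # edge weights between super-nodes
    for u, v in edges:
        w[u][v] = w[u].get(v, 0) + 1
        w[v][u] = w[v].get(u, 0) + 1
    members = {u: {u} for u in nodes}     # original nodes merged into each super-node
    best_size, best_side = math.inf, set()
    while len(w) > 1:
        # One "minimum cut phase": grow a set by adding the most tightly connected node.
        start = next(iter(w))
        added = [start]
        conn = dict(w[start])             # weight from each node to the grown set
        while len(added) < len(w):
            nxt = max((u for u in w if u not in added), key=lambda u: conn.get(u, 0))
            cut_of_phase = conn.get(nxt, 0)
            added.append(nxt)
            for v, wt in w[nxt].items():
                conn[v] = conn.get(v, 0) + wt
        s, t = added[-2], added[-1]
        if cut_of_phase < best_size:
            best_size, best_side = cut_of_phase, set(members[t])
        # Merge the last-added super-node t into s.
        for v, wt in w[t].items():
            if v != s:
                w[s][v] = w[s].get(v, 0) + wt
                w[v][s] = w[v].get(s, 0) + wt
            del w[v][t]
        del w[t]
        members[s] |= members.pop(t)
    return best_size, best_side


def connectivity_modifier(edges, clusters, threshold=math.log10):
    """Split clusters until each has a minimum cut larger than threshold(n).

    Assumption: both sides of a removed cut become candidate clusters;
    the real tool re-clusters instead. Singletons are left unclustered.
    """
    result, stack = [], [set(c) for c in clusters]
    while stack:
        c = stack.pop()
        if len(c) < 2:
            continue
        inner = [(u, v) for u, v in edges if u in c and v in c]
        cut_size, side = min_cut(c, inner)
        if cut_size > threshold(len(c)):
            result.append(c)              # well-connected under the chosen criterion
        else:
            stack.extend([side, c - side])  # drop the cut edges by separating the sides
    return result


def node_coverage(clusters, n_nodes):
    """Fraction of nodes that belong to some cluster."""
    return sum(len(c) for c in clusters) / n_nodes


# Two 5-cliques joined by a single edge: locally dense, but a cut of size 1 separates them.
edges = list(combinations(range(5), 2)) + list(combinations(range(5, 10), 2)) + [(4, 5)]
refined = connectivity_modifier(edges, [set(range(10))])
```

In this toy input the single 10-node cluster is split at the bridging edge, and each 5-clique is kept because its minimum cut of 4 exceeds log10(5). With a stricter criterion, small or sparse clusters can dissolve entirely into unclustered nodes, which is one way the reduction in node coverage mentioned in the abstract can arise.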

Suggested Citation

  • Minhyuk Park & Yasamin Tabatabaee & Vikram Ramavarapu & Baqiao Liu & Vidya Kamath Pailodi & Rajiv Ramachandran & Dmitriy Korobskiy & Fabio Ayres & George Chacko & Tandy Warnow, 2024. "Well-connectedness and community detection," PLOS Complex Systems, Public Library of Science, vol. 1(3), pages 1-25, November.
  • Handle: RePEc:plo:pcsy00:0000009
    DOI: 10.1371/journal.pcsy.0000009

    Download full text from publisher

    File URL: https://journals.plos.org/complexsystems/article?id=10.1371/journal.pcsy.0000009
    Download Restriction: no

    File URL: https://journals.plos.org/complexsystems/article/file?id=10.1371/journal.pcsy.0000009&type=printable
    Download Restriction: no



