fastcluster: Fast Hierarchical, Agglomerative Clustering Routines for R and Python

Author

Listed:
  • Müllner, Daniel

Abstract

The fastcluster package is a C++ library for hierarchical, agglomerative clustering. It provides a fast implementation of the most efficient, current algorithms when the input is a dissimilarity index. Moreover, it features memory-saving routines for hierarchical clustering of vector data. It improves both asymptotic time complexity (in most cases) and practical performance (in all cases) compared to the existing implementations in standard software: several R packages, MATLAB, Mathematica, Python with SciPy.
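To make the abstract's setting concrete, here is a minimal sketch of agglomerative clustering on a dissimilarity matrix, written in plain Python. It is not the package's implementation: it uses a naive search that recomputes cluster distances at every step, which is exactly the kind of cost the efficient algorithms mentioned above avoid. The function name `naive_single_linkage`, the choice of single linkage, and the toy data are assumptions made for illustration only.

```python
from itertools import combinations


def naive_single_linkage(dissim):
    """Naive agglomerative single-linkage clustering (illustrative only).

    `dissim` is a symmetric n x n list of lists of pairwise dissimilarities.
    Returns n - 1 merge steps as tuples (label_a, label_b, height, size).
    Labels 0..n-1 are the input points; n, n+1, ... are the clusters
    created by successive merges.
    """
    n = len(dissim)
    # Active clusters: label -> list of member point indices.
    clusters = {i: [i] for i in range(n)}
    merges = []
    next_label = n
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            # Single linkage: the distance between two clusters is the
            # smallest dissimilarity between any pair of their members.
            d = min(dissim[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Merge the closest pair into a new cluster with a fresh label.
        members = clusters.pop(a) + clusters.pop(b)
        clusters[next_label] = members
        merges.append((a, b, d, len(members)))
        next_label += 1
    return merges


if __name__ == "__main__":
    # Made-up data: four points on a line at positions 0, 1, 5, 7.
    pts = [0.0, 1.0, 5.0, 7.0]
    D = [[abs(p - q) for q in pts] for p in pts]
    for step in naive_single_linkage(D):
        print(step)
```

Each returned tuple records one merge of the dendrogram, from the bottom up. The sketch takes a full dissimilarity matrix as input, which is the first of the two input types the abstract distinguishes; the memory-saving routines for vector data are not illustrated here.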

Suggested Citation

  • Müllner, Daniel, 2013. "fastcluster: Fast Hierarchical, Agglomerative Clustering Routines for R and Python," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 53(i09).
  • Handle: RePEc:jss:jstsof:v:053:i09
    DOI: http://hdl.handle.net/10.18637/jss.v053.i09

    Download full text from publisher

    File URL: https://www.jstatsoft.org/index.php/jss/article/view/v053i09/v53i09.pdf
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v053i09/fastcluster_1.1.11.tar.gz
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v053i09/v53i09.py.zip
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v053i09/v53i09-benchmarks.zip
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v053i09/iris.txt
    Download Restriction: no


    References listed on IDEAS

    1. J. C. Gower & G. J. S. Ross, 1969. "Minimum Spanning Trees and Single Linkage Cluster Analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 18(1), pages 54-64, March.

    Citations

    Cited by:

    1. Bahman Panahi & Mohammad Farhadian & Mohammad Amin Hejazi, 2020. "Systems biology approach identifies functional modules and regulatory hubs related to secondary metabolites accumulation after transition from autotrophic to heterotrophic growth condition in microalg," PLOS ONE, Public Library of Science, vol. 15(2), pages 1-15, February.
    2. Edrisse Chermak & Renato De Donato & Marc F Lensink & Andrea Petta & Luigi Serra & Vittorio Scarano & Luigi Cavallo & Romina Oliva, 2016. "Introducing a Clustering Step in a Consensus Approach for the Scoring of Protein-Protein Docking Models," PLOS ONE, Public Library of Science, vol. 11(11), pages 1-15, November.
    3. Benedict Anchang & Mary T Do & Xi Zhao & Sylvia K Plevritis, 2014. "CCAST: A Model-Based Gating Strategy to Isolate Homogeneous Subpopulations in a Heterogeneous Population of Single Cells," PLOS Computational Biology, Public Library of Science, vol. 10(7), pages 1-14, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Beibei Zhang & Rong Chen, 2018. "Nonlinear Time Series Clustering Based on Kolmogorov-Smirnov 2D Statistic," Journal of Classification, Springer;The Classification Society, vol. 35(3), pages 394-421, October.
    2. Sung-Soo Kim & W. Krzanowski, 2007. "Detecting multiple outliers in linear regression using a cluster method combined with graphical visualization," Computational Statistics, Springer, vol. 22(1), pages 109-119, April.
    3. Ahuja, Ravindra K., 1956-, 1992. "Applications of network optimization," Working papers 3458-92., Massachusetts Institute of Technology (MIT), Sloan School of Management.
    4. Tokuda, Eric K. & Comin, Cesar H. & Costa, Luciano da F., 2022. "Revisiting agglomerative clustering," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 585(C).
    5. Modarres, Reza, 2014. "On the interpoint distances of Bernoulli vectors," Statistics & Probability Letters, Elsevier, vol. 84(C), pages 215-222.
    6. Kirschstein, Thomas & Liebscher, Steffen & Becker, Claudia, 2013. "Robust estimation of location and scatter by pruning the minimum spanning tree," Journal of Multivariate Analysis, Elsevier, vol. 120(C), pages 173-184.
    7. M. Raddant & T. Di Matteo, 2023. "A look at financial dependencies by means of econophysics and financial economics," Journal of Economic Interaction and Coordination, Springer;Society for Economic Science with Heterogeneous Interacting Agents, vol. 18(4), pages 701-734, October.
    8. Cheong, Siew Ann & Fornia, Robert Paulo & Lee, Gladys Hui Ting & Kok, Jun Liang & Yim, Woei Shyr & Xu, Danny Yuan & Zhang, Yiting, 2011. "The Japanese economy in crises: A time series segmentation study," Economics Discussion Papers 2011-24, Kiel Institute for the World Economy (IfW Kiel).
    9. Coletti, Paolo, 2016. "Comparing minimum spanning trees of the Italian stock market using returns and volumes," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 463(C), pages 246-261.
    10. Jean-Pierre Barthélemy & Bruno Leclerc & Bernard Monjardet, 1986. "On the use of ordered sets in problems of comparison and consensus of classifications," Journal of Classification, Springer;The Classification Society, vol. 3(2), pages 187-224, September.
    11. Sergio Scippacercola, 2011. "The Factorial Minimum Spanning Tree as a Reference for a Synthetic Index of Complex Phenomena," Journal of Classification, Springer;The Classification Society, vol. 28(1), pages 21-37, April.
    12. T. Ojasoo & J. C. Doré, 1999. "Citation bias in medical journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 45(1), pages 81-94, May.
    13. Lawrence Hubert, 1974. "Some applications of graph theory to clustering," Psychometrika, Springer;The Psychometric Society, vol. 39(3), pages 283-309, September.
    14. Bruno Leclerc, 1995. "Minimum spanning trees for tree metrics: abridgements and adjustments," Journal of Classification, Springer;The Classification Society, vol. 12(2), pages 207-241, September.
    15. Nenad Mladenovic & Pierre Hansen & Jack Brimberg, 2013. "Sequential clustering with radius and split criteria," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 21(1), pages 95-115, June.
    16. Zhimei Lei & Kuo-Jui Wu & Li Cui & Ming K Lim, 2018. "A Hybrid Approach to Explore the Risk Dependency Structure among Agribusiness Firms," Sustainability, MDPI, vol. 10(2), pages 1-17, February.
    17. Raymond, Ben & Hosie, Graham, 2009. "Network-based exploration and visualisation of ecological data," Ecological Modelling, Elsevier, vol. 220(5), pages 673-683.
    18. Giorgio Fagiolo, 2010. "The international-trade network: gravity equations and topological properties," Journal of Economic Interaction and Coordination, Springer;Society for Economic Science with Heterogeneous Interacting Agents, vol. 5(1), pages 1-25, June.
    19. Unknown, 1996. "Proceedings of a workshop held at Northern Territory University, 6-7 June 1996: Trochus: Status, Hatchery Practice and Nutrition," ACIAR Proceedings Series 135188, Australian Centre for International Agricultural Research.
    20. Eden, Colin, 2004. "Analyzing cognitive maps to help structure issues or problems," European Journal of Operational Research, Elsevier, vol. 159(3), pages 673-686, December.
