
Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering

Author

Listed:
  • Lerato Lerato
  • Thomas Niesler

Abstract

Agglomerative hierarchical clustering becomes infeasible when applied to large datasets due to its O(N²) storage requirements. We present a multi-stage agglomerative hierarchical clustering (MAHC) approach aimed at large datasets of speech segments. The algorithm is based on an iterative divide-and-conquer strategy. The data is first split into independent subsets, each of which is clustered separately. This reduces the storage required for sequential implementations and allows concurrent computation on parallel computing hardware. The resultant clusters are merged and subsequently re-divided into subsets, which are passed to the following iteration. We show that MAHC can match and even surpass the performance of the exact implementation when applied to datasets of speech segments.
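
The divide-and-conquer loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it (n_pairs, ahc, medoid, mahc_sketch) is hypothetical. It assumes 2-D points with Euclidean distance as a stand-in for a distance between speech segments, average linkage, fixed cluster counts, and a merge step that clusters one medoid per first-stage cluster. None of these choices is stated in the abstract.

```python
import math
import random


def n_pairs(n):
    """Number of pairwise distances a full AHC must store for n items."""
    return n * (n - 1) // 2


def ahc(items, k, dist):
    """Naive average-linkage AHC: merge the closest pair until k clusters remain."""
    clusters = [[x] for x in items]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(dist(a, b) for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


def medoid(cluster):
    """Member with the smallest total distance to the rest of its cluster."""
    return min(cluster, key=lambda c: sum(math.dist(c, o) for o in cluster))


def mahc_sketch(points, n_subsets, k_per_subset, k_final, n_iter=2, seed=0):
    """Hypothetical multi-stage AHC loop (assumed details, see lead-in)."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    # Split the data into independent subsets.
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    merged = subsets
    for _ in range(n_iter):
        # Stage 1: cluster each subset separately (could run in parallel).
        stage1 = [c for s in subsets if s for c in ahc(s, k_per_subset, math.dist)]
        # Stage 2: merge first-stage clusters by clustering their medoids.
        medoids = [medoid(c) for c in stage1]
        groups = ahc(range(len(stage1)), k_final,
                     lambda a, b: math.dist(medoids[a], medoids[b]))
        merged = [[p for idx in g for p in stage1[idx]] for g in groups]
        # Re-divide: here each merged cluster becomes a subset for the next pass.
        subsets = merged
    return merged


# Usage: three well-separated groups of six points each.
centres = [(0, 0), (10, 0), (0, 10)]
points = [(cx + dx, cy + dy) for cx, cy in centres
          for dx in (0, 0.5, 1) for dy in (0, 0.5)]
clusters = mahc_sketch(points, n_subsets=2, k_per_subset=3, k_final=3)
print(sorted(len(c) for c in clusters))
```

The storage argument follows from n_pairs: a full distance matrix for N = 1000 items holds 499,500 entries, whereas four subsets of 250 items need 31,125 entries each and can be processed one at a time or concurrently.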

Suggested Citation

  • Lerato Lerato & Thomas Niesler, 2015. "Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-24, October.
  • Handle: RePEc:plo:pone00:0141756
    DOI: 10.1371/journal.pone.0141756

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0141756
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0141756&type=printable
    Download Restriction: no



    Citations

    Cited by:

    1. Tianxiao Wang & Zhecheng Jing & Shupei Zhang & Chengqun Qiu, 2023. "Utilizing Principal Component Analysis and Hierarchical Clustering to Develop Driving Cycles: A Case Study in Zhenjiang," Sustainability, MDPI, vol. 15(6), pages 1-13, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Maksym Polyakov & Morteza Chalak & Md. Sayed Iftekhar & Ram Pandit & Sorada Tapsuwan & Fan Zhang & Chunbo Ma, 2018. "Authorship, Collaboration, Topics, and Research Gaps in Environmental and Resource Economics 1991–2015," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 71(1), pages 217-239, September.
    2. Giger, Markus & Mutea, Emily & Kiteme, Boniface & Eckert, Sandra & Anseeuw, Ward & Zaehringer, Julie G., 2020. "Large agricultural investments in Kenya’s Nanyuki Area: Inventory and analysis of business models," Land Use Policy, Elsevier, vol. 99(C).
    3. Walker, Nathan L. & Styles, David & Coughlan, Paul & Williams, A. Prysor, 2022. "Cross-sector sustainability benchmarking of major utilities in the United Kingdom," Utilities Policy, Elsevier, vol. 78(C).
    4. Abang Zainoren Abang Abdurahman & Syerina Azlin Md Nasir & Wan Fairos Wan Yaacob & Serah Jaya & Suhaili Mokhtar, 2021. "Spatio-Temporal Clustering of Sarawak Malaysia Total Protected Area Visitors," Sustainability, MDPI, vol. 13(21), pages 1-19, October.
    5. Danxue Fan & Meiyue Li, 2025. "Coupling and Coordinated Development Analysis of Digital Economy, Economic Resilience, and Ecological Protection," Sustainability, MDPI, vol. 17(9), pages 1-25, May.
    6. Mulu Abraha Woldegiorgis & Janet E. Hiller & Wubegzier Mekonnen & Jahar Bhowmik, 2018. "Disparities in maternal health services in sub-Saharan Africa," International Journal of Public Health, Springer;Swiss School of Public Health (SSPH+), vol. 63(4), pages 525-535, May.
    7. Monika Stanny & Łukasz Komorowski & Andrzej Rosner, 2021. "The Socio-Economic Heterogeneity of Rural Areas: Towards a Rural Typology of Poland," Energies, MDPI, vol. 14(16), pages 1-23, August.
    8. Anca Gabriela Ilie & Marinela Luminita Emanuela Zlatea & Cristina Negreanu & Dan Dumitriu & Alma Pentescu, 2023. "Reliance on Russian Federation Energy Imports and Renewable Energy in the European Union," The AMFITEATRU ECONOMIC journal, Academy of Economic Studies - Bucharest, Romania, vol. 25(64), pages 780-780, August.
    9. Jon Ellingsen & Vegard H. Larsen & Leif Anders Thorsrud, 2020. "News Media vs. FRED-MD for Macroeconomic Forecasting," CESifo Working Paper Series 8639, CESifo.
    10. Sokhna Dieng & Pierre Michel & Abdoulaye Guindo & Kankoe Sallah & El-Hadj Ba & Badara Cissé & Maria Patrizia Carrieri & Cheikh Sokhna & Paul Milligan & Jean Gaudart, 2020. "Application of Functional Data Analysis to Identify Patterns of Malaria Incidence, to Guide Targeted Control Strategies," IJERPH, MDPI, vol. 17(11), pages 1-23, June.
    11. Leila Fardeau & Eva Lelièvre & Loïc Trabut, 2023. "Complex households, a challenge for the study of families through census data," Working Papers 274, French Institute for Demographic Studies.
    12. Marco Cruz-Sandoval & Elisabet Roca & María Isabel Ortego, 2020. "Compositional Data Analysis Approach in the Measurement of Social-Spatial Segregation: Towards a Sustainable and Inclusive City," Sustainability, MDPI, vol. 12(10), pages 1-19, May.
    13. Yurij L. Katchanov & Yulia V. Markova, 2017. "The “space of physics journals”: topological structure and the Journal Impact Factor," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(1), pages 313-333, October.
    14. Xue Ding & Mengling Qin & Linsen Yin & Dayong Lv & Yao Bai, 2023. "Research on FinTech Talent Evaluation Index System and Recruitment Strategy: Evidence From Shanghai in China," SAGE Open, vol. 13(4), pages 21582440231, November.
    15. Šubová, Nikola, 2022. "The Contribution of Energy Use and Production to Greenhouse Gas Emissions: Evidence from the Agriculture of European Countries," AGRIS on-line Papers in Economics and Informatics, Czech University of Life Sciences Prague, Faculty of Economics and Management, vol. 14(3), September.
    16. Babucea Ana-Gabriela, 2017. "Determinants Of The Recent Romanian Households' Financial Behaviour For Housing Loans - A Territorial Analysis At The Level Of Nuts 3 Regions," Annals - Economy Series, Constantin Brancusi University, Faculty of Economics, vol. 1, pages 71-80, December.
    17. William Day & Herbert Edelsbrunner, 1985. "Investigation of proportional link linkage clustering methods," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 239-254, December.
    18. Brian A Hoover & Marisol García-Reyes & Sonia D Batten & Chelle L Gentemann & William J Sydeman, 2021. "Spatio-temporal persistence of zooplankton communities in the Gulf of Alaska," PLOS ONE, Public Library of Science, vol. 16(1), pages 1-24, January.
    19. Xiao Li & Michele Guindani & Chaan S. Ng & Brian P. Hobbs, 2021. "A Bayesian nonparametric model for textural pattern heterogeneity," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 459-480, March.
    20. Monika Khandelwal & Sabha Sheikh & Ranjeet Kumar Rout & Saiyed Umer & Saurav Mallik & Zhongming Zhao, 2022. "Unsupervised Learning for Feature Representation Using Spatial Distribution of Amino Acids in Aldehyde Dehydrogenase (ALDH2) Protein Sequences," Mathematics, MDPI, vol. 10(13), pages 1-20, June.
