
Evaluation of Clustering Algorithms on HPC Platforms

Author

Listed:
  • Juan M. Cebrian

    (Computer Engineering Department (DITEC), University of Murcia, 30100 Murcia, Spain)

  • Baldomero Imbernón

    (Computer Science Department, Universidad Católica de Murcia (UCAM), 30107 Murcia, Spain)

  • Jesús Soto

    (Computer Science Department, Universidad Católica de Murcia (UCAM), 30107 Murcia, Spain)

  • José M. Cecilia

    (Computer Engineering Department (DISCA), Universitat Politècnica de València (UPV), 46022 Valencia, Spain)

Abstract

Clustering algorithms are among the most widely used kernels for extracting knowledge from large datasets. They group a set of data elements (e.g., images, points, patterns) into clusters to identify patterns or common features of a sample. However, they are computationally expensive, as they often involve costly fitness functions that must be evaluated for every point in the dataset. The cost is even higher for fuzzy methods, where each data point may belong to more than one cluster. In this paper, we evaluate several parallelisation strategies on different heterogeneous platforms for commonly used fuzzy clustering algorithms such as Fuzzy C-means (FCM), Gustafson–Kessel FCM (GK-FCM) and Fuzzy Minimals (FM). The experimental evaluation covers performance and energy trade-offs. Our results show that, depending on its computational pattern, its mathematical foundation and the amount of data to be processed, each algorithm performs best on a different platform.
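
To illustrate why fuzzy methods cost more per point, the following is a minimal sketch of one iteration of the textbook Fuzzy C-means update, written in plain Python. It is not the implementation evaluated in the paper; the function names, the fuzzifier default m = 2 and the Euclidean distance are assumptions made for illustration. Every point needs a membership value for every cluster, so the work per iteration grows with the number of points times the number of clusters.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the degree to which point i belongs to cluster j.

    Textbook FCM rule (assumed here, not taken from the paper):
    u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)), with Euclidean d.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point sits exactly on one or more centers:
            # share its membership equally among those centers.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        memberships.append(
            [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        )
    return memberships


def fcm_update_centers(points, memberships, m=2.0):
    """Recompute each center as the mean of all points weighted by u_ij ** m."""
    n_clusters = len(memberships[0])
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total
             for d in range(dim)]
        )
    return centers


# Hypothetical toy data: three 2-D points and two initial centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships(points, centers)
new_centers = fcm_update_centers(points, u)
```

In this toy run the middle point (1, 0) receives a membership of roughly 0.9 in the first cluster and 0.1 in the second, so it contributes to both center updates. That shared contribution is the extra work a hard clustering method avoids.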

Suggested Citation

  • Juan M. Cebrian & Baldomero Imbernón & Jesús Soto & José M. Cecilia, 2021. "Evaluation of Clustering Algorithms on HPC Platforms," Mathematics, MDPI, vol. 9(17), pages 1-20, September.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:17:p:2156-:d:628910

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/17/2156/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/17/2156/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Juan Izquierdo & Adolfo Crespo Márquez & Jone Uribetxebarria & Asier Erguido, 2019. "Framework for Managing Maintenance of Wind Farms Based on a Clustering Approach and Dynamic Opportunistic Maintenance," Energies, MDPI, vol. 12(11), pages 1-17, May.
    2. Jorge Maldonado-Correa & Sergio Martín-Martínez & Estefanía Artigao & Emilio Gómez-Lázaro, 2020. "Using SCADA Data for Wind Turbine Condition Monitoring: A Systematic Literature Review," Energies, MDPI, vol. 13(12), pages 1-21, June.
    3. Francisco Bilendo & Angela Meyer & Hamed Badihi & Ningyun Lu & Philippe Cambron & Bin Jiang, 2022. "Applications and Modeling Techniques of Wind Turbine Power Curve for Wind Farms—A Review," Energies, MDPI, vol. 16(1), pages 1-38, December.
    4. Yuri Merizalde & Luis Hernández-Callejo & Oscar Duque-Perez & Víctor Alonso-Gómez, 2019. "Maintenance Models Applied to Wind Turbines. A Comprehensive Overview," Energies, MDPI, vol. 12(2), pages 1-41, January.
    5. Cheng Xiao & Zuojun Liu & Tieling Zhang & Lei Zhang, 2019. "On Fault Prediction for Wind Turbine Pitch System Using Radar Chart and Support Vector Machine Approach," Energies, MDPI, vol. 12(14), pages 1-18, July.
    6. Bowen Jiang & Yuangang Li & Weixin Yang, 2020. "Evaluation and Treatment Analysis of Air Quality Including Particulate Pollutants: A Case Study of Shandong Province, China," IJERPH, MDPI, vol. 17(24), pages 1-24, December.
    7. Chen, Junsheng & Li, Jian & Chen, Weigen & Wang, Youyuan & Jiang, Tianyan, 2020. "Anomaly detection for wind turbines based on the reconstruction of condition parameters using stacked denoising autoencoders," Renewable Energy, Elsevier, vol. 147(P1), pages 1469-1480.
    8. Jani, Hardik K. & Kachhwaha, Surendra Singh & Nagababu, Garlapati & Das, Alok, 2022. "Temporal and spatial simultaneity assessment of wind-solar energy resources in India by statistical analysis and machine learning clustering approach," Energy, Elsevier, vol. 248(C).
