Printed from https://ideas.repec.org/a/plo/pone00/0329254.html

Empirically calibrated simulations reveal the limits of phenotypic clustering algorithms for biodiversity assessment in data-scarce crops

Author

Listed:
  • Abdel Kader Naino Jika

Abstract

Clustering algorithms are widely used for phenotypic characterization and germplasm management, particularly in data-scarce crops such as neglected and underutilized species (NUS) that lack genomic resources. However, their performance under biologically realistic conditions remains poorly understood. Standard clustering methods commonly applied in crop research often assume distinct, isotropic, and homogeneous clusters, assumptions rarely satisfied in real-world phenotypic datasets. We developed a flexible and empirically calibrated simulation framework, using phenotypic data from West African fonio (Digitaria exilis), to benchmark the performance of eleven clustering algorithms under both idealized and realistic scenarios. Our simulations integrated heterogeneous trait distributions (normal, gamma), strong inter-trait correlations (up to r = –0.84), heteroscedasticity, and moderate population structure (mean Pst = 0.16 ± 0.001, achieved through iterative calibration). Each scenario was replicated 100 times, with clustering accuracy evaluated using external (ARI, NMI) and internal (Silhouette, Davies–Bouldin) validation metrics under standardized conditions. The results revealed consistently poor algorithm performance under realistic conditions (e.g., ARI
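The abstract names the Adjusted Rand Index (ARI) as one of its external validation metrics. As a minimal sketch of what that metric computes, and not the paper's own code, the function below derives ARI from the contingency table of two label vectors using only the Python standard library. The function name `adjusted_rand_index` and the toy label vectors are illustrative assumptions, not taken from the source.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two partitions of the same items.

    Hypothetical helper for illustration; returns 1.0 for identical
    partitions (up to relabelling) and about 0.0 for chance agreement.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label vectors must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare: treat as trivially identical

    # Contingency counts: how many items fall in each (true, predicted) cell.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of item pairs that agree within cells, rows, and columns.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2        # upper bound on agreement
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions have one cluster
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    print(adjusted_rand_index(truth, [1, 1, 0, 0]))  # same grouping, relabelled
    print(adjusted_rand_index(truth, [0, 1, 0, 1]))  # grouping cuts across truth
```

Because ARI is corrected for chance, a clustering that merely relabels the true groups scores 1.0, while one that splits every true group evenly can score below zero. That correction is what makes the metric suited to comparing algorithms across replicated simulations.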

Suggested Citation

  • Abdel Kader Naino Jika, 2025. "Empirically calibrated simulations reveal the limits of phenotypic clustering algorithms for biodiversity assessment in data-scarce crops," PLOS ONE, Public Library of Science, vol. 20(12), pages 1-14, December.
  • Handle: RePEc:plo:pone00:0329254
    DOI: 10.1371/journal.pone.0329254

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0329254
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0329254&type=printable
    Download Restriction: no

