AdaCLV for Interpretable Variable Clustering and Dimensionality Reduction of Spectroscopic Data

Author

Listed:
  • Marion, Rebecca

    (Université catholique de Louvain, LIDAM/ISBA, Belgium)

  • Govaerts, Bernadette

    (Université catholique de Louvain, LIDAM/ISBA, Belgium)

  • von Sachs, Rainer

    (Université catholique de Louvain, LIDAM/ISBA, Belgium)

Abstract

This paper introduces a new method, Adaptive Clustering around Latent Variables (AdaCLV), for simultaneous dimensionality reduction and variable clustering (the partitioning of variables into groups). This unsupervised method is particularly well suited for the exploration of spectroscopic datasets, such as Nuclear Magnetic Resonance (NMR) spectra, and can be used for the identification of potential biomarkers. AdaCLV is inspired by existing multivariate methods from the Clustering around Latent Variables (CLV) family, but it offers several key advantages with respect to these methods. First, AdaCLV allows variables to belong to multiple clusters to varying degrees. A cluster membership degree is estimated for each variable and cluster, and these memberships are used to define non-orthogonal latent variables that summarize the clusters. As a result, the clusters and latent variables identified by AdaCLV are more interpretable and representative of spectroscopic data, where peaks for different molecules (i.e., variable clusters) may overlap and variables within a cluster have different degrees of importance. Second, while the performance of existing methods depends greatly on hyperparameter selection, AdaCLV is less sensitive to its hyperparameters, adapting to the clustering structure present in the data. This paper compares AdaCLV with existing CLV methods and other competitors in experiments involving real and semi-artificial NMR spectra. AdaCLV is shown to be more robust to hyperparameter choice and to have better precision than the other methods, for all cluster numbers, sample sizes and levels of signal tested, while achieving a comparable level of recall.
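
As a rough illustration of the general idea of clustering variables around latent variables, the Python sketch below alternates between summarizing each cluster by a latent variable and reassigning each variable to the cluster whose latent variable it correlates with most strongly. It is not the AdaCLV algorithm described above: it uses hard assignments only, and the function name `clv_sketch`, its parameters, and the choice of the first principal component as the latent variable are assumptions made for illustration. It also assumes that no column of `X` is constant.

```python
import numpy as np


def clv_sketch(X, n_clusters, n_iter=20, seed=0, init_labels=None):
    """Hard-assignment clustering of variables around latent variables.

    Illustrative only: a basic k-means-style loop, not AdaCLV.
    X has shape (n_samples, n_variables); columns must not be constant.
    Returns (labels, latents): one label per variable and one latent
    variable (column) per cluster.
    """
    rng = np.random.default_rng(seed)
    # Standardize columns so that each has mean 0 and norm sqrt(n).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    n, p = Z.shape
    if init_labels is None:
        labels = rng.integers(n_clusters, size=p)
    else:
        labels = np.asarray(init_labels).copy()
    latents = np.zeros((n, n_clusters))
    for _ in range(n_iter):
        for k in range(n_clusters):
            members = Z[:, labels == k]
            if members.shape[1] == 0:
                # Re-seed an empty cluster with one randomly chosen variable.
                members = Z[:, [rng.integers(p)]]
            # Latent variable = first principal component scores of the cluster.
            U, s, _ = np.linalg.svd(members, full_matrices=False)
            latents[:, k] = U[:, 0] * s[0]
        # Correlation of every variable with every latent variable.
        unit_latents = latents / np.linalg.norm(latents, axis=0)
        corr = (Z.T @ unit_latents) / np.sqrt(n)
        # Assign each variable to the latent variable it matches best.
        new_labels = np.argmax(corr**2, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, latents
```

Calling `labels, latents = clv_sketch(X, n_clusters=2)` on a samples-by-variables matrix returns one cluster label per variable and one latent variable per cluster. AdaCLV, by contrast, estimates a graded membership for each variable and cluster, which this sketch does not attempt.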

Suggested Citation

  • Marion, Rebecca & Govaerts, Bernadette & von Sachs, Rainer, 2020. "AdaCLV for Interpretable Variable Clustering and Dimensionality Reduction of Spectroscopic Data," LIDAM Reprints ISBA 2020033, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
  • Handle: RePEc:aiz:louvar:2020033
    DOI: https://doi.org/10.1016/j.chemolab.2020.104169
    Note: In: Chemometrics and Intelligent Laboratory Systems - Vol. 206 (2020)
    Citations


    Cited by:

    1. Marion, Rebecca & Lederer, Johannes & Govaerts, Bernadette & von Sachs, Rainer, 2021. "VC-PCR: A Prediction Method based on Supervised Variable Selection and Clustering," LIDAM Discussion Papers ISBA 2021040, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
