
ICS for complex data with application to outlier detection for density data

Author

Listed:
  • Camille Mondon

    (TSE-R - Toulouse School of Economics - UT Capitole - Université Toulouse Capitole - UT - Université de Toulouse - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement)

  • Thi Huong Trinh

    (Thuongmai University - Partenaires INRAE)

  • Anne Ruiz-Gazen

    (TSE-R - Toulouse School of Economics - UT Capitole - Université Toulouse Capitole - UT - Université de Toulouse - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement)

  • Christine Thomas-Agnan

    (TSE-R - Toulouse School of Economics - UT Capitole - Université Toulouse Capitole - UT - Université de Toulouse - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement)

Abstract

Invariant coordinate selection (ICS) is a dimension reduction method, used as a preliminary step for clustering and outlier detection. It has been primarily applied to multivariate data. This work introduces a coordinate-free definition of ICS in an abstract Euclidean space and extends the method to complex data. Functional and distributional data are preprocessed into a finite-dimensional subspace. For example, in the framework of Bayes Hilbert spaces, distributional data are smoothed into compositional spline functions through the Maximum Penalised Likelihood method. We describe an outlier detection procedure for complex data and study the impact of some preprocessing parameters on the results. We compare our approach with other outlier detection methods through simulations, producing promising results in scenarios with a low proportion of outliers. ICS allows detecting abnormal climate events in a sample of daily maximum temperature distributions recorded across the provinces of Northern Vietnam between 1987 and 2016.
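
For readers unfamiliar with ICS, the multivariate version that this paper generalises can be sketched in a few lines. The sketch below is illustrative only and is not the authors' coordinate-free procedure for density data. It assumes the classical COV-COV4 scatter pair used in the multivariate ICS literature listed in the references, and the names `cov4` and `ics` are hypothetical helpers introduced here. ICS jointly diagonalises the two scatter matrices; with a low proportion of outliers, the components with the largest generalised kurtosis tend to reveal them.

```python
import numpy as np

def cov4(X):
    """Fourth-moment scatter matrix (COV4), assumed here as second scatter."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = np.cov(Xc, rowvar=False)
    # Squared Mahalanobis distance of each observation to the mean.
    d2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S), Xc)
    # Covariance weighted by d2; 1/(p + 2) makes it match COV under normality.
    return (Xc * d2[:, None]).T @ Xc / (n * (p + 2))

def ics(X):
    """Return generalised kurtosis values (descending) and invariant coordinates."""
    Xc = X - X.mean(axis=0)
    S1 = np.cov(Xc, rowvar=False)
    S2 = cov4(X)
    # Whiten with S1^{-1/2}, then diagonalise the whitened S2.
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    M = S1_inv_sqrt @ S2 @ S1_inv_sqrt
    evals, evecs = np.linalg.eigh((M + M.T) / 2)
    order = np.argsort(evals)[::-1]
    B = evecs[:, order].T @ S1_inv_sqrt  # rows are the ICS directions
    return evals[order], Xc @ B.T

# Toy data (made up for illustration): 490 standard normal points in
# dimension 5, plus 10 outliers shifted along the first axis.
rng = np.random.default_rng(0)
n_in, n_out, p = 490, 10, 5
X = rng.standard_normal((n_in + n_out, p))
X[n_in:, 0] += 10.0

evals, Z = ics(X)
scores = Z[:, 0] ** 2                  # squared first invariant coordinate
flagged = np.argsort(scores)[-n_out:]  # indices of the n_out largest scores
```

In this toy setting the number of outliers is known, so the largest scores are simply flagged. In practice both the number of components to keep and the cut-off have to be chosen from the data, and the paper studies how preprocessing parameters affect the results.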

Suggested Citation

  • Camille Mondon & Thi Huong Trinh & Anne Ruiz-Gazen & Christine Thomas-Agnan, 2025. "ICS for complex data with application to outlier detection for density data," Working Papers hal-05081264, HAL.
  • Handle: RePEc:hal:wpaper:hal-05081264
    Note: View the original document on HAL open archive server: https://hal.science/hal-05081264v1

    Download full text from publisher

    File URL: https://hal.science/hal-05081264v1/document
    Download Restriction: no

    References listed on IDEAS

    1. Archimbaud, Aurore & Nordhausen, Klaus & Ruiz-Gazen, Anne, 2018. "ICS for multivariate outlier detection with application to quality control," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 184-199.
    2. Tyler, David E., 2010. "A note on multivariate location and scatter statistics for sparse data sets," Statistics & Probability Letters, Elsevier, vol. 80(17-18), pages 1409-1413, September.
    3. Ruiz-Gazen, Anne & Thomas-Agnan, Christine & Laurent, Thibault & Mondon, Camille, 2022. "Detecting outliers in compositional data using Invariant Coordinate Selection," TSE Working Papers 22-1320, Toulouse School of Economics (TSE).
    4. Nordhausen, Klaus & Ruiz-Gazen, Anne, 2022. "On the usage of joint diagonalization in multivariate statistics," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    5. Virta, Joni & Li, Bing & Nordhausen, Klaus & Oja, Hannu, 2020. "Independent component analysis for multivariate functional data," Journal of Multivariate Analysis, Elsevier, vol. 176(C).
    6. Loperfido, Nicola, 2021. "Some theoretical properties of two kurtosis matrices, with application to invariant coordinate selection," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    7. J. Machalová & K. Hron & G.S. Monti, 2016. "Preprocessing of centred logratio transformed density functions using smoothing splines," Journal of Applied Statistics, Taylor & Francis Journals, vol. 43(8), pages 1419-1435, June.
    8. David E. Tyler & Frank Critchley & Lutz Dümbgen & Hannu Oja, 2009. "Invariant co‐ordinate selection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 549-592, June.
    9. Archimbaud, Aurore & Boulfani, Feriel & Gendre, Xavier & Nordhausen, Klaus & Ruiz-Gazen, Anne & Virta, Joni, 2025. "ICS for multivariate functional anomaly detection with applications to predictive maintenance and quality control," Econometrics and Statistics, Elsevier, vol. 33(C), pages 282-303.
    10. Thomas-Agnan, Christine & Simioni, Michel & Trinh, Thi-Huong, 2023. "Discrete and smooth scalar-on-density compositional regression for assessing the impact of climate change on rice yield in Vietnam," TSE Working Papers 23-1410, Toulouse School of Economics (TSE), revised Jun 2025.
    11. Aurore Archimbaud & Zlatko Drmac & Klaus Nordhausen & Una Radojicic & Anne Ruiz-Gazen, 2023. "Numerical Considerations and a New Implementation for Invariant Coordinate Selection," Post-Print hal-04038657, HAL.
    12. Dai, Wenlin & Mrkvička, Tomáš & Sun, Ying & Genton, Marc G., 2020. "Functional outlier detection and taxonomy by sequential transformations," Computational Statistics & Data Analysis, Elsevier, vol. 149(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Archimbaud, Aurore & Boulfani, Feriel & Gendre, Xavier & Nordhausen, Klaus & Ruiz-Gazen, Anne & Virta, Joni, 2025. "ICS for multivariate functional anomaly detection with applications to predictive maintenance and quality control," Econometrics and Statistics, Elsevier, vol. 33(C), pages 282-303.
    2. Nordhausen, Klaus & Ruiz-Gazen, Anne, 2022. "On the usage of joint diagonalization in multivariate statistics," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    3. Ruiz-Gazen, Anne & Thomas-Agnan, Christine & Laurent, Thibault & Mondon, Camille, 2022. "Detecting outliers in compositional data using Invariant Coordinate Selection," TSE Working Papers 22-1320, Toulouse School of Economics (TSE).
    4. Loperfido, Nicola, 2021. "Some theoretical properties of two kurtosis matrices, with application to invariant coordinate selection," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    5. Fischer, Daniel & Berro, Alain & Nordhausen, Klaus & Ruiz-Gazen, Anne, 2019. "REPPlab: An R package for detecting clusters and outliers using exploratory projection pursuit," TSE Working Papers 19-1001, Toulouse School of Economics (TSE).
    6. Dominique Guégan & Matteo Iacopini, 2018. "Nonparameteric forecasting of multivariate probability density functions," Documents de travail du Centre d'Economie de la Sorbonne 18012, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    7. Cristian Preda & Quentin Grimonprez & Vincent Vandewalle, 2021. "Categorical Functional Data Analysis. The cfda R Package," Mathematics, MDPI, vol. 9(23), pages 1-31, November.
    8. Virta, J., 2016. "One-step M-estimates of scatter and the independence property," Statistics & Probability Letters, Elsevier, vol. 110(C), pages 133-136.
    9. Moritz Herrmann & Fabian Scheipl, 2021. "A Geometric Perspective on Functional Outlier Detection," Stats, MDPI, vol. 4(4), pages 1-41, November.
    10. Pini, Alessia & Stamm, Aymeric & Vantini, Simone, 2018. "Hotelling’s T2 in separable Hilbert spaces," Journal of Multivariate Analysis, Elsevier, vol. 167(C), pages 284-305.
    11. Javed, Farrukh & Loperfido, Nicola & Mazur, Stepan, 2025. "The method of moments for multivariate random sums in the Poisson-Skew-Normal case," Statistics & Probability Letters, Elsevier, vol. 219(C).
    12. Karel Hron & Jitka Machalová & Alessandra Menafoglio, 2023. "Bivariate densities in Bayes spaces: orthogonal decomposition and spline representation," Statistical Papers, Springer, vol. 64(5), pages 1629-1667, October.
    13. Ojo, Oluwasegun Taiwo & Fernández Anta, Antonio & Genton, Marc G. & Lillo Rodríguez, Rosa Elvira, 2022. "Multivariate Functional Outlier Detection using the FastMUOD Indices," DES - Working Papers. Statistics and Econometrics. WS 35665, Universidad Carlos III de Madrid. Departamento de Estadística.
    14. Zhong, Rou & Liu, Shishi & Li, Haocheng & Zhang, Jingxiao, 2022. "Robust functional principal component analysis for non-Gaussian longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    15. Nordhausen, Klaus & Oja, Hannu & Tyler, David E., 2022. "Asymptotic and bootstrap tests for subspace dimension," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    16. Peña, Daniel & Prieto, Francisco J. & Rendón, Carolina, 2014. "Independent components techniques based on kurtosis for functional data analysis," DES - Working Papers. Statistics and Econometrics. WS ws141006, Universidad Carlos III de Madrid. Departamento de Estadística.
    17. Thomas-Agnan, Christine & Simioni, Michel & Trinh, Thi-Huong, 2023. "Discrete and smooth scalar-on-density compositional regression for assessing the impact of climate change on rice yield in Vietnam," TSE Working Papers 23-1410, Toulouse School of Economics (TSE), revised Jun 2025.
    18. Dargel, Lukas & Thomas-Agnan, Christine, 2024. "Pairwise share ratio interpretations of compositional regression models," Computational Statistics & Data Analysis, Elsevier, vol. 195(C).
    19. Taskinen, Sara & Koch, Inge & Oja, Hannu, 2012. "Robustifying principal component analysis with spatial sign vectors," Statistics & Probability Letters, Elsevier, vol. 82(4), pages 765-774.
    20. Talská, R. & Menafoglio, A. & Machalová, J. & Hron, K. & Fišerová, E., 2018. "Compositional regression with functional response," Computational Statistics & Data Analysis, Elsevier, vol. 123(C), pages 66-85.

