
REPPlab: An R package for detecting clusters and outliers using exploratory projection pursuit

Author

Listed:
  • Fischer, Daniel
  • Berro, Alain
  • Nordhausen, Klaus
  • Ruiz-Gazen, Anne

Abstract

The R package REPPlab is designed to explore multivariate data sets using one-dimensional unsupervised projection pursuit. It is useful as a preprocessing step to find clusters or as an outlier detection tool for multivariate data. Apart from the packages tourr and rggobi, no implementation of exploratory projection pursuit tools is available in R. REPPlab is an R interface to the Java program EPP-lab, which implements four projection indices and three biologically inspired optimization algorithms. It also offers new tools for plotting and combining the results, as well as specific tools for outlier detection. The functionality of the package is illustrated through simulations and real data.

Suggested Citation

  • Fischer, Daniel & Berro, Alain & Nordhausen, Klaus & Ruiz-Gazen, Anne, 2019. "REPPlab: An R package for detecting clusters and outliers using exploratory projection pursuit," TSE Working Papers 19-1001, Toulouse School of Economics (TSE).
  • Handle: RePEc:tse:wpaper:122892

    Download full text from publisher

    File URL: https://www.tse-fr.eu/sites/default/files/TSE/documents/doc/wp/2019/wp_tse_1001.pdf
    File Function: Full Text
    Download Restriction: no

    References listed on IDEAS

    1. Wickham, Hadley & Cook, Dianne & Hofmann, Heike & Buja, Andreas, 2011. "tourr: An R Package for Exploring Multivariate Data with Projections," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 40(i02).
    2. David E. Tyler & Frank Critchley & Lutz Dümbgen & Hannu Oja, 2009. "Invariant co‐ordinate selection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 549-592, June.
    3. Huang, Bei & Cook, Dianne & Wickham, Hadley, 2012. "tourrGui: A gWidgets GUI for the Tour to Explore High-Dimensional Data Using Low-Dimensional Projections," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 49(i06).
    4. Nordhausen, Klaus & Oja, Hannu & Tyler, David E., 2008. "Tools for Exploring Multivariate Data: The Package ICS," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 28(i06).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Fischer & Alain Berro & Klaus Nordhausen & Anne Ruiz-Gazen, 2021. "REPPlab: An R package for detecting clusters and outliers using exploratory projection pursuit," Post-Print hal-03548865, HAL.
    2. Alashwali, Fatimah & Kent, John T., 2016. "The use of a common location measure in the invariant coordinate selection and projection pursuit," Journal of Multivariate Analysis, Elsevier, vol. 152(C), pages 145-161.
    3. Dümbgen, Lutz & Nordhausen, Klaus & Schuhmacher, Heike, 2016. "New algorithms for M-estimation of multivariate scatter and location," Journal of Multivariate Analysis, Elsevier, vol. 144(C), pages 200-217.
    4. Archimbaud, Aurore & Nordhausen, Klaus & Ruiz-Gazen, Anne, 2018. "ICS for multivariate outlier detection with application to quality control," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 184-199.
    5. Nordhausen, Klaus & Oja, Hannu & Tyler, David E., 2022. "Asymptotic and bootstrap tests for subspace dimension," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    6. Nordhausen, Klaus & Ruiz-Gazen, Anne, 2022. "On the usage of joint diagonalization in multivariate statistics," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    7. Valero-Mora, Pedro M. & Ledesma, Ruben, 2012. "Graphical User Interfaces for R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 49(i01).
    8. Ilmonen, Pauliina, 2013. "On asymptotic properties of the scatter matrix based estimates for complex valued independent component analysis," Statistics & Probability Letters, Elsevier, vol. 83(4), pages 1219-1226.
    9. Ursula Laa & Dianne Cook & Andreas Buja & German Valencia, 2020. "Hole or grain? A Section Pursuit Index for Finding Hidden Structure in Multiple Dimensions," Monash Econometrics and Business Statistics Working Papers 17/20, Monash University, Department of Econometrics and Business Statistics.
    10. Ruiz-Gazen, Anne & Thomas-Agnan, Christine & Laurent, Thibault & Mondon, Camille, 2022. "Detecting outliers in compositional data using Invariant Coordinate Selection," TSE Working Papers 22-1320, Toulouse School of Economics (TSE).
    11. Nicola Loperfido, 2019. "Finite mixtures, projection pursuit and tensor rank: a triangulation," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 145-173, March.
    12. Klaus Nordhausen, 2014. "On robustifying some second order blind source separation methods for nonstationary time series," Statistical Papers, Springer, vol. 55(1), pages 141-156, February.
    13. Dürre, Alexander & Vogel, Daniel & Tyler, David E., 2014. "The spatial sign covariance matrix with unknown location," Journal of Multivariate Analysis, Elsevier, vol. 130(C), pages 107-117.
    14. Jin Wang & Weihua Zhou, 2015. "Effect of kurtosis on efficiency of some multivariate medians," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 27(3), pages 331-348, September.
    15. Ursula Laa & Dianne Cook, 2020. "Using tours to visually investigate properties of new projection pursuit indexes with application to problems in physics," Computational Statistics, Springer, vol. 35(3), pages 1171-1205, September.
    16. Virta, J., 2016. "One-step M-estimates of scatter and the independence property," Statistics & Probability Letters, Elsevier, vol. 110(C), pages 133-136.
    17. Niladri Roy Chowdhury & Dianne Cook & Heike Hofmann & Mahbubul Majumder & Eun-Kyung Lee & Amy Toth, 2015. "Using visual statistical inference to better understand random class separations in high dimension, low sample size data," Computational Statistics, Springer, vol. 30(2), pages 293-316, June.
    18. Jorge M. Arevalillo & Hilario Navarro, 2021. "Skewness-Kurtosis Model-Based Projection Pursuit with Application to Summarizing Gene Expression Data," Mathematics, MDPI, vol. 9(9), pages 1-18, April.
    19. Huang, Bei & Cook, Dianne & Wickham, Hadley, 2012. "tourrGui: A gWidgets GUI for the Tour to Explore High-Dimensional Data Using Low-Dimensional Projections," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 49(i06).
    20. Nordhausen, Klaus & Ruiz-Gazen, Anne, 2021. "On the usage of joint diagonalization in multivariate statistics," TSE Working Papers 21-1268, Toulouse School of Economics (TSE).

    More about this item

    Keywords

    genetic algorithms; Java; kurtosis; particle swarm optimization; projection index; Tribes; projection matrix; unsupervised data analysis;

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
