IDEAS home Printed from https://ideas.repec.org/a/spr/metron/v79y2021i2d10.1007_s40300-020-00183-5.html

Local projections for high-dimensional outlier detection

Authors
  • Thomas Ortner

    (Vienna University of Technology)

  • Peter Filzmoser

    (Vienna University of Technology)

  • Maia Rohm

    (Vienna University of Technology)

  • Sarka Brodinova

    (Vienna University of Technology)

  • Christian Breiteneder

    (Vienna University of Technology)

Abstract

A novel approach to outlier detection, called local projections, is proposed. It builds on concepts from the Local Outlier Factor (LOF) (Breunig et al. in LOF: identifying density-based local outliers. In: ACM SIGMOD Record, ACM, vol. 29, pp. 93–104, 2000) and ROBPCA (Hubert et al. in Technometrics 47(1):64–79, 2005). By combining aspects of both methods, the algorithm is robust to noise variables and can detect outliers in multi-group situations. The idea is to describe each observation and its neighbors locally using linear projections. The outlyingness of an observation is then determined by a weighted distance of the observation to all identified projection spaces, with weights depending on the appropriateness of the local description. Experiments with simulated and real data demonstrate the usefulness of the method compared with existing outlier detection algorithms.
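
The general idea in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the abstract does not specify how neighborhoods, projection spaces, or weights are constructed, so everything below is an assumption made for illustration. The function name `local_projection_scores` and the parameters `k` (neighborhood size) and `q` (dimension of each local projection space) are hypothetical. Each local space is taken to be the span of the first `q` principal directions of an observation's `k` nearest neighbors, the distance is the orthogonal distance to that affine space, and the weight is the share of neighborhood variance the space explains, used here as a simple stand-in for the "appropriateness of the local description". No robust estimation is attempted.

```python
import numpy as np

def local_projection_scores(X, k=10, q=1):
    """Illustrative outlyingness score (assumed construction, not the paper's)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances between all observations.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))

    dists = np.empty((n, n))   # dists[i, j]: distance of observation i to local space j
    weights = np.empty(n)      # weights[j]: assumed quality of local space j
    for j in range(n):
        # k nearest neighbors of observation j; position 0 is normally j itself.
        nbrs = np.argsort(D[j])[1:k + 1]
        center = X[nbrs].mean(axis=0)
        # Principal directions of the centered neighborhood via SVD.
        _, s, Vt = np.linalg.svd(X[nbrs] - center, full_matrices=False)
        V = Vt[:q]
        # Orthogonal distance of every observation to the affine local space.
        R = X - center
        resid = R - (R @ V.T) @ V
        dists[:, j] = np.linalg.norm(resid, axis=1)
        # Assumed weight: share of neighborhood variance explained by the space.
        weights[j] = (s[:q] ** 2).sum() / (s ** 2).sum()

    weights /= weights.sum()
    # Weighted average distance to all local projection spaces.
    return dists @ weights

# Toy data: 60 points close to a line in 3-D, plus one point far off the line.
t = np.linspace(-5, 5, 60)
rng = np.random.default_rng(0)
line = np.outer(t, [1.0, 1.0, 0.0]) / np.sqrt(2)
X = line + rng.normal(scale=0.01, size=line.shape)
X = np.vstack([X, [0.0, 0.0, 3.0]])

scores = local_projection_scores(X, k=10, q=1)
print(int(np.argmax(scores)))  # index of the observation with the largest score
```

In this toy setting the appended point (index 60) lies far from every local line and should receive the largest score. A plain weighted average over all local spaces is a deliberate simplification; with several well-separated groups, the weighting would need to discount spaces that describe other groups.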

Suggested Citation

  • Thomas Ortner & Peter Filzmoser & Maia Rohm & Sarka Brodinova & Christian Breiteneder, 2021. "Local projections for high-dimensional outlier detection," METRON, Springer;Sapienza Università di Roma, vol. 79(2), pages 189-206, August.
  • Handle: RePEc:spr:metron:v:79:y:2021:i:2:d:10.1007_s40300-020-00183-5
    DOI: 10.1007/s40300-020-00183-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40300-020-00183-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Hubert, Mia & Van Driessen, Katrien, 2004. "Fast and robust discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 45(2), pages 301-320, March.
    2. Filzmoser, Peter & Maronna, Ricardo & Werner, Mark, 2008. "Outlier identification in high dimensions," Computational Statistics & Data Analysis, Elsevier, vol. 52(3), pages 1694-1711, January.
    3. Todorov, Valentin & Filzmoser, Peter, 2009. "An Object-Oriented Framework for Robust Multivariate Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 32(i03).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cerioli, Andrea & Farcomeni, Alessio & Riani, Marco, 2013. "Robust distances for outlier-free goodness-of-fit testing," Computational Statistics & Data Analysis, Elsevier, vol. 65(C), pages 29-45.
    2. Jan Kalina & Jan Tichavský, 2022. "The minimum weighted covariance determinant estimator for high-dimensional data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(4), pages 977-999, December.
    3. Asuman Turkmen & Nedret Billor, 2013. "Partial least squares classification for high dimensional data using the PCOUT algorithm," Computational Statistics, Springer, vol. 28(2), pages 771-788, April.
    4. Valentin Todorov & Matthias Templ & Peter Filzmoser, 2011. "Detection of multivariate outliers in business survey data with incomplete information," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 5(1), pages 37-56, April.
    5. G. Zioutas & C. Chatzinakos & T. D. Nguyen & L. Pitsoulis, 2017. "Optimization techniques for multivariate least trimmed absolute deviation estimation," Journal of Combinatorial Optimization, Springer, vol. 34(3), pages 781-797, October.
    6. Steffen Liebscher & Thomas Kirschstein, 2015. "Efficiency of the pMST and RDELA location and scatter estimators," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 99(1), pages 63-82, January.
    7. Torti, Francesca & Corbellini, Aldo & Atkinson, Anthony C., 2021. "fsdaSAS: a package for robust regression for very large datasets including the batch forward search," LSE Research Online Documents on Economics 109895, London School of Economics and Political Science, LSE Library.
    8. Thomas Triebs & Subal C. Kumbhakar, 2012. "Management Practice in Production," ifo Working Paper Series 129, ifo Institute - Leibniz Institute for Economic Research at the University of Munich.
    9. Alashwali, Fatimah & Kent, John T., 2016. "The use of a common location measure in the invariant coordinate selection and projection pursuit," Journal of Multivariate Analysis, Elsevier, vol. 152(C), pages 145-161.
    10. Alper Sinan & B. Barıs Alkan, 2015. "A useful approach to identify the multicollinearity in the presence of outliers," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(5), pages 986-993, May.
    11. B. Barış Alkan, 2016. "Robust Principal Component Analysis Based on Modified Minimum Covariance Determinant in the Presence of Outliers," Alphanumeric Journal, Bahadir Fatih Yildirim, vol. 4(2), pages 85-94, September.
    12. David E. Tyler & Frank Critchley & Lutz Dümbgen & Hannu Oja, 2009. "Invariant co‐ordinate selection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 549-592, June.
    13. M. Hubert & P. Rousseeuw & K. Vakili, 2014. "Shape bias of robust covariance estimators: an empirical study," Statistical Papers, Springer, vol. 55(1), pages 15-28, February.
    14. Marco Riani & Andrea Cerioli & Francesca Torti, 2014. "On consistency factors and efficiency of robust S-estimators," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 23(2), pages 356-387, June.
    15. Matthias Kohl & Peter Ruckdeschel & Helmut Rieder, 2010. "Infinitesimally Robust estimation in general smoothly parametrized models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 19(3), pages 333-354, August.
    16. Junlong Zhao & Chao Liu & Lu Niu & Chenlei Leng, 2019. "Multiple influential point detection in high dimensional regression spaces," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 385-408, April.
    17. Van Aelst, S. & Vandervieren, E. & Willems, G., 2012. "A Stahel–Donoho estimator based on huberized outlyingness," Computational Statistics & Data Analysis, Elsevier, vol. 56(3), pages 531-542.
    18. Bilodeau, Martin & Micheaux, Pierre Lafaye de & Mahdi, Smail, 2015. "The R Package groc for Generalized Regression on Orthogonal Components," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 65(i01).
    19. Gianna S. Monti & Peter Filzmoser & Roland C. Deutsch, 2018. "A Robust Approach to Risk Assessment Based on Species Sensitivity Distributions," Risk Analysis, John Wiley & Sons, vol. 38(10), pages 2073-2086, October.
    20. C. Chatzinakos & L. Pitsoulis & G. Zioutas, 2016. "Optimization techniques for robust multivariate location and scatter estimation," Journal of Combinatorial Optimization, Springer, vol. 31(4), pages 1443-1460, May.
