Printed from https://ideas.repec.org/a/eee/jmvana/v166y2018icp32-49.html

On masking and swamping robustness of leading nonparametric outlier identifiers for multivariate data

Authors

  • Wang, Shanshan
  • Serfling, Robert

Abstract

For any outlier detection procedure, a key concern is robustness with respect to the two possible misclassification errors: masking (outliers misclassified as nonoutliers) and swamping (nonoutliers misclassified as outliers). Although parametric model-based simulation results are informative, one also desires nonparametric masking and swamping robustness measures that are more broadly applicable. To this end, notions of finite-sample masking and swamping breakdown points, formulated abstractly for outlyingness functions in arbitrary data settings (Serfling and Wang, 2014), are introduced in the present paper into the multivariate data setting. Formulas for the measures are derived for three important affine invariant nonparametric multivariate outlyingness functions: Mahalanobis distance, Mahalanobis spatial, and projection. Using the formulas, favorable masking and swamping breakdown points, balanced equally, are seen for the Mahalanobis distance outlyingness using minimum covariance determinant (MCD) location and scatter estimators, and likewise for the projection outlyingness with median and MAD for univariate location and scale. Also, Mahalanobis spatial outlyingness with MCD standardization is competitive when swamping robustness is given higher priority than masking robustness. A small simulation study with bivariate contaminated standard normal and contaminated exponential models yields results consistent with the theoretical formulas. Some practical recommendations are discussed.
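
As a rough illustration of the projection outlyingness with median and MAD mentioned in the abstract, the Python sketch below scores a point x against a bivariate sample X by the largest value of |u'x - med(u'X)| / MAD(u'X) over unit directions u. This is not the authors' code. The function names, the restriction to two dimensions, the finite grid of directions (the definition takes a supremum over all unit vectors), the unscaled MAD, and the toy data are all assumptions made for illustration.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation about the median (no consistency factor)."""
    values = list(values)
    m = median(values)
    return median([abs(v - m) for v in values])


def projection_outlyingness(x, data, n_directions=180):
    """Approximate projection outlyingness of a 2-D point x relative to data.

    Assumption: the supremum over all unit vectors is replaced by a maximum
    over n_directions equally spaced angles in [0, pi); u and -u give the
    same value, so half the circle is enough.
    """
    worst = 0.0
    for k in range(n_directions):
        theta = math.pi * k / n_directions
        u = (math.cos(theta), math.sin(theta))
        projected = [u[0] * p[0] + u[1] * p[1] for p in data]
        scale = mad(projected)
        if scale == 0.0:
            # Simplification: skip directions in which the sample is degenerate.
            continue
        center = median(projected)
        score = abs(u[0] * x[0] + u[1] * x[1] - center) / scale
        worst = max(worst, score)
    return worst


# Hypothetical toy sample: a 5 x 5 grid centered at the origin.
data = [(float(i), float(j)) for i in range(-2, 3) for j in range(-2, 3)]
center_score = projection_outlyingness((0.0, 0.0), data)
far_score = projection_outlyingness((10.0, 10.0), data)
print(center_score, far_score)
```

A point would then be flagged as an outlier when its score exceeds a chosen threshold. In this picture, masking corresponds to contamination pulling the score of a true outlier below the threshold, and swamping to contamination pushing the score of a regular point above it.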

Suggested Citation

  • Wang, Shanshan & Serfling, Robert, 2018. "On masking and swamping robustness of leading nonparametric outlier identifiers for multivariate data," Journal of Multivariate Analysis, Elsevier, vol. 166(C), pages 32-49.
  • Handle: RePEc:eee:jmvana:v:166:y:2018:i:c:p:32-49
    DOI: 10.1016/j.jmva.2018.02.003

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X1730132X
    Download Restriction: Full text for ScienceDirect subscribers only

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gloria Gonzalez‐Rivera & Yun Luo & Esther Ruiz, 2020. "Prediction regions for interval‐valued time series," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 35(4), pages 373-390, June.
    2. Yi He & John H. J. Einmahl, 2017. "Estimation of extreme depth-based quantile regions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(2), pages 449-461, March.
    3. Loperfido, Nicola, 2018. "Skewness-based projection pursuit: A computational approach," Computational Statistics & Data Analysis, Elsevier, vol. 120(C), pages 42-57.
    4. Liu, Xiaohui & Rahman, Jafer & Luo, Shihua, 2019. "Generalized and robustified empirical depths for multivariate data," Statistics & Probability Letters, Elsevier, vol. 146(C), pages 70-79.
    5. Marc Hallin & Davy Paindaveine & Miroslav Siman, 2008. "Multivariate quantiles and multiple-output regression quantiles: from L1 optimization to halfspace depth," Working Papers ECARES 2008_042, ULB -- Universite Libre de Bruxelles.
    6. Davy Paindaveine & Germain Van Bever, 2017. "Halfspace Depths for Scatter, Concentration and Shape Matrices," Working Papers ECARES ECARES 2017-19, ULB -- Universite Libre de Bruxelles.
    7. Yves Dominicy & Pauliina Ilmonen & David Veredas, 2017. "Multivariate Hill Estimators," International Statistical Review, International Statistical Institute, vol. 85(1), pages 108-142, April.
    8. Daniel Kosiorowski & Jerzy P. Rydlewski, 2019. "Centrality-oriented Causality -- A Study of EU Agricultural Subsidies and Digital Developement in Poland," Papers 1908.11099, arXiv.org, revised Sep 2019.
    9. Serfling, Robert & Wijesuriya, Uditha, 2017. "Depth-based nonparametric description of functional data, with emphasis on use of spatial depth," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 24-45.
    10. Davy Paindaveine & Germain Van Bever, 2015. "Discussion of “Multivariate Functional Outlier Detection”, by Mia Hubert, Peter Rousseeuw and Pieter Segaert," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 24(2), pages 223-231, July.
    11. P. Navarro-Esteban & J. A. Cuesta-Albertos, 2021. "High-dimensional outlier detection using random projections," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(4), pages 908-934, December.
    12. Bräutigam, Marcel & Kratz, Marie, 2018. "On the Dependence between Quantiles and Dispersion Estimators," ESSEC Working Papers WP1807, ESSEC Research Center, ESSEC Business School.
    13. Nordhausen, Klaus & Ruiz-Gazen, Anne, 2021. "On the usage of joint diagonalization in multivariate statistics," TSE Working Papers 21-1268, Toulouse School of Economics (TSE).
    14. Klaus Nordhausen & Anne Ruiz-Gazen, 2022. "On the usage of joint diagonalization in multivariate statistics," Post-Print hal-04296111, HAL.
    15. Daniel Kosiorowski & Jerzy P. Rydlewski, 2020. "Centrality-oriented causality. A study of EU agricultural subsidies and digital developement in Poland," Operations Research and Decisions, Wroclaw University of Science and Technology, Faculty of Management, vol. 30(3), pages 47-63.
    16. Dai, Wenlin & Genton, Marc G., 2019. "Directional outlyingness for multivariate functional data," Computational Statistics & Data Analysis, Elsevier, vol. 131(C), pages 50-65.
    17. M. Shams, 2021. "On weakly equivariant estimators," Statistical Papers, Springer, vol. 62(4), pages 1611-1650, August.
    18. Xin Dang & Hailin Sang & Lauren Weatherall, 2019. "Gini covariance matrix and its affine equivariant version," Statistical Papers, Springer, vol. 60(3), pages 641-666, June.
    19. Ramsay, Kelly & Durocher, Stephane & Leblanc, Alexandre, 2021. "Robustness and asymptotics of the projection median," Journal of Multivariate Analysis, Elsevier, vol. 181(C).
    20. Nagatsuka, Hideki & Kawakami, Hiroshi & Kamakura, Toshinari & Yamamoto, Hisashi, 2013. "The exact finite-sample distribution of the median absolute deviation about the median of continuous random variables," Statistics & Probability Letters, Elsevier, vol. 83(4), pages 999-1005.
