
Robust regression against heavy heterogeneous contamination

Authors

Listed:
  • Takayuki Kawashima

    (Tokyo Institute of Technology/RIKEN)

  • Hironori Fujisawa

    (The Institute of Statistical Mathematics/RIKEN)

Abstract

The γ-divergence is well known for its strong robustness against heavy contamination. By virtue of this property, many applications of the γ-divergence have been proposed. There are two types of γ-divergence for the regression problem, in which the base measures are handled differently. In this study, these two γ-divergences are compared, and a large difference is found between them under heterogeneous contamination, where the outlier ratio depends on the explanatory variable. One γ-divergence retains strong robustness even under heterogeneous contamination. The other does not in general; however, it does under homogeneous contamination, where the outlier ratio does not depend on the explanatory variable, or when the parametric model of the response variable belongs to a location-scale family in which the scale does not depend on the explanatory variables. Hung et al. (Biometrics 74(1):145–154, 2018) discussed strong robustness in a logistic regression model under the additional assumption that the tuning parameter γ is sufficiently large. The results obtained in this study hold for any parametric model without such an additional assumption.
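As a rough illustration of how two γ-type losses for regression can differ in where the normalizing term is handled, the sketch below evaluates two empirical losses for a normal regression model. The abstract gives no formulas, so everything here is an assumption: the two loss forms follow commonly used γ-cross-entropy expressions (one normalizes inside the average over the explanatory variable, the other outside), and the function names `loss_inner` and `loss_outer`, the simulated data, and the choice γ = 0.5 are hypothetical. It does not indicate which form corresponds to which γ-divergence in the article.

```python
import numpy as np


def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def normal_power_integral(sd, gamma):
    """Closed form of the integral of N(y; mean, sd^2)^(1 + gamma) over y."""
    return (2.0 * np.pi * sd ** 2) ** (-gamma / 2.0) / np.sqrt(1.0 + gamma)


def loss_inner(y, mean, sd, gamma):
    """Assumed form 1: normalize each term before averaging over observations."""
    f_gamma = normal_pdf(y, mean, sd) ** gamma
    norm = normal_power_integral(sd, gamma) ** (gamma / (1.0 + gamma))
    return -np.log(np.mean(f_gamma / norm)) / gamma


def loss_outer(y, mean, sd, gamma):
    """Assumed form 2: average numerator and normalizer separately."""
    f_gamma = normal_pdf(y, mean, sd) ** gamma
    power = np.broadcast_to(normal_power_integral(sd, gamma), np.shape(y))
    return -np.log(np.mean(f_gamma)) / gamma + np.log(np.mean(power)) / (1.0 + gamma)


# Hypothetical simulated data: linear mean, unit noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=500)
y = 1.0 + 2.0 * x + rng.normal(size=500)
mean = 1.0 + 2.0 * x
gamma = 0.5

# Scale constant in x: the two losses agree up to rounding error.
same = abs(loss_inner(y, mean, 1.0, gamma) - loss_outer(y, mean, 1.0, gamma))

# Scale depending on x: the two losses no longer agree.
diff = abs(loss_inner(y, mean, 0.5 + x, gamma) - loss_outer(y, mean, 0.5 + x, gamma))

print(same, diff)
```

Under these assumed forms, the two losses coincide whenever the model scale is constant in the explanatory variable, because the normalizing integral then factors out of the average. This is in line with the abstract's remark about location-scale families whose scale does not depend on the explanatory variables.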

Suggested Citation

  • Takayuki Kawashima & Hironori Fujisawa, 2023. "Robust regression against heavy heterogeneous contamination," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 86(4), pages 421-442, May.
  • Handle: RePEc:spr:metrik:v:86:y:2023:i:4:d:10.1007_s00184-022-00874-1
    DOI: 10.1007/s00184-022-00874-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00184-022-00874-1
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s00184-022-00874-1?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Dankmar Böhning & Bruce Lindsay, 1988. "Monotonicity of quadratic-approximation algorithms," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 40(4), pages 641-663, December.
    2. Riani, Marco & Atkinson, Anthony C. & Corbellini, Aldo & Perrotta, Domenico, 2020. "Robust regression with density power divergence: theory, comparisons, and data analysis," LSE Research Online Documents on Economics 103931, London School of Economics and Political Science, LSE Library.
    3. Hung Hung & Zhi-Yu Jou & Su-Yun Huang, 2018. "Robust mislabel logistic regression without modeling mislabel probabilities," Biometrics, The International Biometric Society, vol. 74(1), pages 145-154, March.
    4. Takafumi Kanamori & Hironori Fujisawa, 2015. "Robust estimation under heavy contamination using unnormalized models," Biometrika, Biometrika Trust, vol. 102(3), pages 559-572.
    5. Fujisawa, Hironori & Eguchi, Shinto, 2008. "Robust parameter estimation with a small bias against heavy contamination," Journal of Multivariate Analysis, Elsevier, vol. 99(9), pages 2053-2081, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hirose, Kei & Fujisawa, Hironori & Sese, Jun, 2017. "Robust sparse Gaussian graphical modeling," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 172-190.
    2. Mingyang Ren & Sanguo Zhang & Qingzhao Zhang, 2021. "Robust high-dimensional regression for data with anomalous responses," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(4), pages 703-736, August.
    3. Hung Hung & Zhi-Yu Jou & Su-Yun Huang, 2018. "Robust mislabel logistic regression without modeling mislabel probabilities," Biometrics, The International Biometric Society, vol. 74(1), pages 145-154, March.
    4. Arun Kumar Kuchibhotla & Somabha Mukherjee & Ayanendranath Basu, 2019. "Statistical inference based on bridge divergences," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(3), pages 627-656, June.
    5. Roussille, Nina & Scuderi, Benjamin, 2023. "Bidding for Talent: A Test of Conduct in a High-Wage Labor Market," IZA Discussion Papers 16352, Institute of Labor Economics (IZA).
    6. Wang, Fa, 2022. "Maximum likelihood estimation and inference for high dimensional generalized factor models with application to factor-augmented regressions," Journal of Econometrics, Elsevier, vol. 229(1), pages 180-200.
    7. Utkarsh J. Dang & Michael P.B. Gallaugher & Ryan P. Browne & Paul D. McNicholas, 2023. "Model-Based Clustering and Classification Using Mixtures of Multivariate Skewed Power Exponential Distributions," Journal of Classification, Springer;The Classification Society, vol. 40(1), pages 145-167, April.
    8. Abhijit Mandal & Beste Hamiye Beyaztas & Soutir Bandyopadhyay, 2023. "Robust density power divergence estimates for panel data models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(5), pages 773-798, October.
    9. Tian, Guo-Liang & Tang, Man-Lai & Liu, Chunling, 2012. "Accelerating the quadratic lower-bound algorithm via optimizing the shrinkage parameter," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 255-265.
    10. Bohning, Dankmar, 1999. "The lower bound method in probit regression," Computational Statistics & Data Analysis, Elsevier, vol. 30(1), pages 13-17, March.
    11. Francesca Torti & Aldo Corbellini & Anthony C. Atkinson, 2021. "fsdaSAS: A Package for Robust Regression for Very Large Datasets Including the Batch Forward Search," Stats, MDPI, vol. 4(2), pages 1-21, April.
    12. Liu, Wenchen & Tang, Yincai & Wu, Xianyi, 2020. "Separating variables to accelerate non-convex regularized optimization," Computational Statistics & Data Analysis, Elsevier, vol. 147(C).
    13. Catania, Leopoldo & Luati, Alessandra, 2020. "Robust estimation of a location parameter with the integrated Hogg function," Statistics & Probability Letters, Elsevier, vol. 164(C).
    14. Riani, Marco & Atkinson, Anthony Curtis & Corbellini, Aldo & Farcomeni, Alessio & Laurini, Fabrizio, 2024. "Information Criteria for Outlier Detection Avoiding Arbitrary Significance Levels," Econometrics and Statistics, Elsevier, vol. 29(C), pages 189-205.
    15. Jonathan James, 2012. "A tractable estimator for general mixed multinomial logit models," Working Papers (Old Series) 1219, Federal Reserve Bank of Cleveland.
    16. Maria E. Frey & Hans C. Petersen & Oke Gerke, 2020. "Nonparametric Limits of Agreement for Small to Moderate Sample Sizes: A Simulation Study," Stats, MDPI, vol. 3(3), pages 1-13, August.
    17. Shogo Kato & Shinto Eguchi, 2016. "Robust estimation of location and concentration parameters for the von Mises–Fisher distribution," Statistical Papers, Springer, vol. 57(1), pages 205-234, March.
    18. Durante, Daniele & Canale, Antonio & Rigon, Tommaso, 2019. "A nested expectation–maximization algorithm for latent class models with covariates," Statistics & Probability Letters, Elsevier, vol. 146(C), pages 97-103.
    19. Nicolas Depraetere & Martina Vandebroek, 2017. "A comparison of variational approximations for fast inference in mixed logit models," Computational Statistics, Springer, vol. 32(1), pages 93-125, March.
    20. Chen, Ting-Li & Fujisawa, Hironori & Huang, Su-Yun & Hwang, Chii-Ruey, 2016. "On the weak convergence and Central Limit Theorem of blurring and nonblurring processes with application to robust location estimation," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 165-184.
