Printed from https://ideas.repec.org/a/eee/csdana/v111y2017icp116-130.html

Robust and sparse estimators for linear regression models

Author

Listed:
  • Smucler, Ezequiel
  • Yohai, Victor J.

Abstract

Penalized regression estimators are popular tools for the analysis of sparse and high-dimensional models. However, penalized regression estimators defined using an unbounded loss function can be very sensitive to the presence of outlying observations, especially to high leverage outliers. The robust and asymptotic properties of ℓ1-penalized MM-estimators and MM-estimators with an adaptive ℓ1 penalty are studied. For the case of a fixed number of covariates, the asymptotic distribution of the estimators is derived and it is proven that for the case of an adaptive ℓ1 penalty, the resulting estimator can have the oracle property. The advantages of the proposed estimators are demonstrated through an extensive simulation study and the analysis of real data sets. The proofs of the theoretical results are available in the Supplementary material to this article (see Appendix A).
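
As a rough guide to what the abstract describes, the general shape of an ℓ1-penalized M-type objective and its adaptive variant can be sketched as follows. The notation is assumed for illustration and is not taken from the abstract: ρ denotes a bounded loss function, s_n a preliminary robust scale estimate of the residuals, λ_n a penalty level, and β̃ an initial coefficient estimate used to build the adaptive weights. The paper's exact definitions and regularity conditions may differ.

```latex
% Sketch only: symbols (\rho, s_n, \lambda_n, \tilde{\beta}) are assumed notation.
% l1-penalized M-type estimator with a bounded loss \rho:
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \frac{1}{n} \sum_{i=1}^{n}
      \rho\!\left( \frac{y_i - x_i^{\top}\beta}{s_n} \right)
    + \lambda_n \sum_{j=1}^{p} |\beta_j|

% Adaptive variant: each coefficient is penalized in inverse proportion
% to the magnitude of an initial estimate \tilde{\beta}_j:
\hat{\beta}^{\mathrm{ad}}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \frac{1}{n} \sum_{i=1}^{n}
      \rho\!\left( \frac{y_i - x_i^{\top}\beta}{s_n} \right)
    + \lambda_n \sum_{j=1}^{p} \frac{|\beta_j|}{|\tilde{\beta}_j|}
```

Because ρ is bounded, a single observation can contribute only a limited amount to the objective, which is the usual reason such estimators resist high leverage outliers. In this setting the oracle property is commonly understood to mean that, asymptotically, the estimator selects the true set of nonzero coefficients and estimates them as efficiently as if that set had been known in advance.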

Suggested Citation

  • Smucler, Ezequiel & Yohai, Victor J., 2017. "Robust and sparse estimators for linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 111(C), pages 116-130.
  • Handle: RePEc:eee:csdana:v:111:y:2017:i:c:p:116-130
    DOI: 10.1016/j.csda.2017.02.002

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947317300221
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    2. Lan Wang & Runze Li, 2009. "Weighted Wilcoxon-Type Smoothly Clipped Absolute Deviation Method," Biometrics, The International Biometric Society, vol. 65(2), pages 564-571, June.
    3. Maronna, Ricardo A. & Yohai, Victor J., 2015. "High finite-sample efficiency and robustness based on distance-constrained maximum likelihood," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 262-274.
    4. Gijbels, I. & Vrinssen, I., 2015. "Robust nonnegative garrote variable selection in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 85(C), pages 1-22.
    5. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    6. Wang, Hansheng & Li, Guodong & Jiang, Guohua, 2007. "Robust Regression Shrinkage and Consistent Variable Selection Through the LAD-Lasso," Journal of Business & Economic Statistics, American Statistical Association, vol. 25, pages 347-355, July.
    7. Alfons, Andreas & Croux, Christophe & Gelper, Sarah, 2016. "Robust groupwise least angle regression," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 421-435.
    8. Hössjer, Ola, 1992. "On the optimality of S-estimators," Statistics & Probability Letters, Elsevier, vol. 14(5), pages 413-419, July.
    9. Brent Johnson & Limin Peng, 2008. "Rank-based variable selection," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 20(3), pages 241-252.
    10. Khan, Jafar A. & Van Aelst, Stefan & Zamar, Ruben H., 2007. "Robust Linear Model Selection Based on Least Angle Regression," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 1289-1299, December.
    11. Xueqin Wang & Yunlu Jiang & Mian Huang & Heping Zhang, 2013. "Robust Variable Selection With Exponential Squared Loss," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(502), pages 632-643, June.

    Citations

    Cited by:

    1. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    2. Kepplinger, David, 2023. "Robust variable selection and estimation via adaptive elastic net S-estimators for linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 183(C).
    3. Bottmer, Lea & Croux, Christophe & Wilms, Ines, 2022. "Sparse regression for large data sets with outliers," European Journal of Operational Research, Elsevier, vol. 297(2), pages 782-794.
    4. Lila, Maurício Franca & Meira, Erick & Cyrino Oliveira, Fernando Luiz, 2022. "Forecasting unemployment in Brazil: A robust reconciliation approach using hierarchical data," Socio-Economic Planning Sciences, Elsevier, vol. 82(PB).
    5. Luca Insolia & Ana Kenney & Francesca Chiaromonte & Giovanni Felici, 2022. "Simultaneous feature selection and outlier detection with optimality guarantees," Biometrics, The International Biometric Society, vol. 78(4), pages 1592-1603, December.
    6. Ricardo A. Maronna & Víctor J. Yohai, 2018. "Discussion of “The power of monitoring: how to make the most of a contaminated multivariate sample” by Andrea Cerioli, Marco Riani, Anthony C. Atkinson and Aldo Corbellini," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 27(4), pages 603-604, December.
    7. Kalogridis, Ioannis & Van Aelst, Stefan, 2023. "Robust penalized estimators for functional linear regression," Journal of Multivariate Analysis, Elsevier, vol. 194(C).
    8. Junlong Zhao & Chao Liu & Lu Niu & Chenlei Leng, 2019. "Multiple influential point detection in high dimensional regression spaces," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 385-408, April.
    9. Tianxiang Liu & Ting Kei Pong & Akiko Takeda, 2019. "A refined convergence analysis of $\mathrm{pDCA}_{e}$ with applications to simultaneous sparse recovery and outlier detection," Computational Optimization and Applications, Springer, vol. 73(1), pages 69-100, May.
    10. Ana M. Bianco & Graciela Boente & Gonzalo Chebi, 2022. "Penalized robust estimators in sparse logistic regression," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(3), pages 563-594, September.
    11. Jun Zhao & Guan’ao Yan & Yi Zhang, 2022. "Robust estimation and shrinkage in ultrahigh dimensional expectile regression with heavy tails and variance heterogeneity," Statistical Papers, Springer, vol. 63(1), pages 1-28, February.
    12. Thompson, Ryan, 2022. "Robust subset selection," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).
    13. Wang, Yibo & Karunamuni, Rohana J., 2022. "High-dimensional robust regression with Lq-loss functions," Computational Statistics & Data Analysis, Elsevier, vol. 176(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mingqiu Wang & Guo-Liang Tian, 2016. "Robust group non-convex estimations for high-dimensional partially linear models," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(1), pages 49-67, March.
    2. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    3. Gijbels, I. & Vrinssen, I., 2015. "Robust nonnegative garrote variable selection in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 85(C), pages 1-22.
    4. Qingguo Tang & R. J. Karunamuni, 2018. "Robust variable selection for finite mixture regression models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(3), pages 489-521, June.
    5. N. Neykov & P. Filzmoser & P. Neytchev, 2014. "Ultrahigh dimensional variable selection through the penalized maximum trimmed likelihood estimator," Statistical Papers, Springer, vol. 55(1), pages 187-207, February.
    6. Long Feng & Changliang Zou & Zhaojun Wang & Xianwu Wei & Bin Chen, 2015. "Robust spline-based variable selection in varying coefficient model," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 78(1), pages 85-118, January.
    7. Song, Yunquan & Liang, Xijun & Zhu, Yanji & Lin, Lu, 2021. "Robust variable selection with exponential squared loss for the spatial autoregressive model," Computational Statistics & Data Analysis, Elsevier, vol. 155(C).
    8. Tianfa Xie & Ruiyuan Cao & Jiang Du, 2020. "Variable selection for spatial autoregressive models with a diverging number of parameters," Statistical Papers, Springer, vol. 61(3), pages 1125-1145, June.
    9. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    10. Thompson, Ryan, 2022. "Robust subset selection," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).
    11. Yang Peng & Bin Luo & Xiaoli Gao, 2022. "Robust Moderately Clipped LASSO for Simultaneous Outlier Detection and Variable Selection," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 84(2), pages 694-707, November.
    12. Weichi Wu & Zhou Zhou, 2017. "Nonparametric Inference for Time-Varying Coefficient Quantile Regression," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 35(1), pages 98-109, January.
    13. Jiang, Rong & Qian, Weimin & Zhou, Zhangong, 2012. "Variable selection and coefficient estimation via composite quantile regression with randomly censored data," Statistics & Probability Letters, Elsevier, vol. 82(2), pages 308-317.
    14. Guang Cheng & Hao Zhang & Zuofeng Shang, 2015. "Sparse and efficient estimation for partial spline models with increasing dimension," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(1), pages 93-127, February.
    15. Wentao Wang & Jiaxuan Liang & Rong Liu & Yunquan Song & Min Zhang, 2022. "A Robust Variable Selection Method for Sparse Online Regression via the Elastic Net Penalty," Mathematics, MDPI, vol. 10(16), pages 1-18, August.
    16. Hu Yang & Ning Li & Jing Yang, 2020. "A robust and efficient estimation and variable selection method for partially linear models with large-dimensional covariates," Statistical Papers, Springer, vol. 61(5), pages 1911-1937, October.
    17. T. Cai & J. Huang & L. Tian, 2009. "Regularized Estimation for the Accelerated Failure Time Model," Biometrics, The International Biometric Society, vol. 65(2), pages 394-404, June.
    18. Z. John Daye & Jinbo Chen & Hongzhe Li, 2012. "High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis," Biometrics, The International Biometric Society, vol. 68(1), pages 316-326, March.
    19. Jonathan Boss & Alexander Rix & Yin‐Hsiu Chen & Naveen N. Narisetty & Zhenke Wu & Kelly K. Ferguson & Thomas F. McElrath & John D. Meeker & Bhramar Mukherjee, 2021. "A hierarchical integrative group least absolute shrinkage and selection operator for analyzing environmental mixtures," Environmetrics, John Wiley & Sons, Ltd., vol. 32(8), December.
    20. Hoai An Le Thi & Manh Cuong Nguyen, 2017. "DCA based algorithms for feature selection in multi-class support vector machine," Annals of Operations Research, Springer, vol. 249(1), pages 273-300, February.
