
Robust Moderately Clipped LASSO for Simultaneous Outlier Detection and Variable Selection

Author

Listed:
  • Yang Peng

    (Department of Mathematics and Statistics, University of North Carolina at Greensboro)

  • Bin Luo

    (Duke University)

  • Xiaoli Gao

    (Department of Mathematics and Statistics, University of North Carolina at Greensboro)

Abstract

Outlier detection has become an important and challenging issue in high-dimensional data analysis because data contamination and high dimensionality occur together. Most widely used penalized least squares methods are sensitive to outliers because of the l2 loss. In this paper, we propose a Robust Moderately Clipped LASSO (RMCL) estimator that performs simultaneous outlier detection, variable selection, and robust estimation. The RMCL estimator can be computed efficiently using a coordinate descent algorithm within a convex-concave procedure. Our numerical studies demonstrate that the RMCL estimator is superior in both variable selection and outlier detection, and thus can be advantageous in difficult prediction problems with data contamination.
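
The abstract does not state the RMCL objective, so the following is only an illustrative sketch of the general idea of joint outlier detection and variable selection, not the authors' estimator. It assumes a mean-shift model y = X b + g + noise in the style of She and Owen (2011), where a nonzero g[i] flags observation i as an outlier, and it swaps the moderately clipped penalty for plain l1 (LASSO) penalties on both b and g. With convex l1 penalties no convex-concave step is needed, so the sketch uses block coordinate descent alone. All names (`mean_shift_lasso`, `lam_beta`, `lam_gamma`) and the simulated data are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: the proximal map of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def mean_shift_lasso(X, y, lam_beta, lam_gamma, n_iter=200):
    """Minimize 0.5*||y - X b - g||^2 + lam_beta*||b||_1 + lam_gamma*||g||_1.

    Assumed illustrative objective (plain l1 penalties), not the RMCL one.
    """
    n, p = X.shape
    beta = np.zeros(p)
    gamma = np.zeros(n)
    col_sq = (X ** 2).sum(axis=0)          # squared column norms
    for _ in range(n_iter):
        r = y - X @ beta - gamma           # current residual
        for j in range(p):
            r += X[:, j] * beta[j]         # partial residual without coordinate j
            beta[j] = soft_threshold(X[:, j] @ r, lam_beta) / col_sq[j]
            r -= X[:, j] * beta[j]
        # With beta fixed, the gamma update is elementwise soft-thresholding.
        gamma = soft_threshold(y - X @ beta, lam_gamma)
    return beta, gamma

# Simulated example (hypothetical data): 2 active predictors, 5 shifted responses.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)
y[:5] += 10.0                              # contaminate the first five observations

beta_hat, gamma_hat = mean_shift_lasso(X, y, lam_beta=5.0, lam_gamma=2.0)
flagged = np.flatnonzero(gamma_hat)        # indices flagged as outliers
print("flagged observations:", flagged)
print("largest coefficients:", np.argsort(-np.abs(beta_hat))[:2])
```

In this formulation the two tuning parameters play separate roles: `lam_beta` controls sparsity of the regression coefficients, while `lam_gamma` sets how large a residual must be before an observation is treated as an outlier. A nonconvex penalty such as the moderately clipped LASSO would replace the soft-thresholding steps and would require the convex-concave procedure mentioned in the abstract.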

Suggested Citation

  • Yang Peng & Bin Luo & Xiaoli Gao, 2022. "Robust Moderately Clipped LASSO for Simultaneous Outlier Detection and Variable Selection," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 84(2), pages 694-707, November.
  • Handle: RePEc:spr:sankhb:v:84:y:2022:i:2:d:10.1007_s13571-022-00279-0
    DOI: 10.1007/s13571-022-00279-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13571-022-00279-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Fan, J. & Li, R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    2. Kwon, Sunghoon & Lee, Sangin & Kim, Yongdai, 2015. "Moderately clipped LASSO," Computational Statistics & Data Analysis, Elsevier, vol. 92(C), pages 53-67.
    3. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    4. She, Yiyuan & Owen, Art B., 2011. "Outlier Detection Using Nonconvex Penalized Regression," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 626-639.
    5. Gijbels, I. & Vrinssen, I., 2015. "Robust nonnegative garrote variable selection in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 85(C), pages 1-22.
    6. Xueqin Wang & Yunlu Jiang & Mian Huang & Heping Zhang, 2013. "Robust Variable Selection With Exponential Squared Loss," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(502), pages 632-643, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    2. Smucler, Ezequiel & Yohai, Victor J., 2017. "Robust and sparse estimators for linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 111(C), pages 116-130.
    3. Mingqiu Wang & Guo-Liang Tian, 2016. "Robust group non-convex estimations for high-dimensional partially linear models," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(1), pages 49-67, March.
    4. Z. John Daye & Jinbo Chen & Hongzhe Li, 2012. "High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis," Biometrics, The International Biometric Society, vol. 68(1), pages 316-326, March.
    5. Song, Yunquan & Liang, Xijun & Zhu, Yanji & Lin, Lu, 2021. "Robust variable selection with exponential squared loss for the spatial autoregressive model," Computational Statistics & Data Analysis, Elsevier, vol. 155(C).
    6. Kangning Wang & Lu Lin, 2019. "Robust and efficient estimator for simultaneous model structure identification and variable selection in generalized partial linear varying coefficient models with longitudinal data," Statistical Papers, Springer, vol. 60(5), pages 1649-1676, October.
    7. Kepplinger, David, 2023. "Robust variable selection and estimation via adaptive elastic net S-estimators for linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 183(C).
    8. Wentao Wang & Jiaxuan Liang & Rong Liu & Yunquan Song & Min Zhang, 2022. "A Robust Variable Selection Method for Sparse Online Regression via the Elastic Net Penalty," Mathematics, MDPI, vol. 10(16), pages 1-18, August.
    9. Qingguo Tang & R. J. Karunamuni, 2018. "Robust variable selection for finite mixture regression models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(3), pages 489-521, June.
    10. Tianfa Xie & Ruiyuan Cao & Jiang Du, 2020. "Variable selection for spatial autoregressive models with a diverging number of parameters," Statistical Papers, Springer, vol. 61(3), pages 1125-1145, June.
    11. Thompson, Ryan, 2022. "Robust subset selection," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).
    12. Yunquan Song & Yaqi Liu & Hang Su, 2022. "Robust Variable Selection for Single-Index Varying-Coefficient Model with Missing Data in Covariates," Mathematics, MDPI, vol. 10(12), pages 1-14, June.
    13. Sunghoon Kwon & Jeongyoun Ahn & Woncheol Jang & Sangin Lee & Yongdai Kim, 2017. "A doubly sparse approach for group variable selection," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(5), pages 997-1025, October.
    14. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    15. Xu, Yang & Zhao, Shishun & Hu, Tao & Sun, Jianguo, 2021. "Variable selection for generalized odds rate mixture cure models with interval-censored failure time data," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    16. Emmanouil Androulakis & Christos Koukouvinos & Kalliopi Mylona & Filia Vonta, 2010. "A real survival analysis application via variable selection methods for Cox's proportional hazards model," Journal of Applied Statistics, Taylor & Francis Journals, vol. 37(8), pages 1399-1406.
    17. Jun Zhu & Hsin‐Cheng Huang & Perla E. Reyes, 2010. "On selection of spatial linear models for lattice data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(3), pages 389-402, June.
    18. Lam, Clifford, 2008. "Estimation of large precision matrices through block penalization," LSE Research Online Documents on Economics 31543, London School of Economics and Political Science, LSE Library.
    19. Ping Wu & Xinchao Luo & Peirong Xu & Lixing Zhu, 2017. "New variable selection for linear mixed-effects models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(3), pages 627-646, June.
    20. Naimoli, Antonio, 2022. "Modelling the persistence of Covid-19 positivity rate in Italy," Socio-Economic Planning Sciences, Elsevier, vol. 82(PA).
