
The Sparse MLE for Ultrahigh-Dimensional Feature Screening

Author

Listed:
  • Chen Xu
  • Jiahua Chen

Abstract

Feature selection is fundamental for modeling high-dimensional data, where the number of features can be huge and much larger than the sample size. Because the feature space is so large, many traditional procedures become numerically infeasible. It is hence essential to first remove most of the apparently noninfluential features before any more elaborate analysis. Several procedures have recently been developed for this purpose, among which sure independence screening (SIS) is widely used. To gain computational efficiency, SIS screens features based on their individual predictive power. In this article, we propose a new screening method via the sparsity-restricted maximum likelihood estimator (SMLE). The new method naturally accounts for the joint effects of features in the screening process, which gives it an edge and the potential to outperform existing methods. This conjecture is further supported by simulation studies under a number of modeling settings. We show that the proposed method is screening consistent in the context of ultrahigh-dimensional generalized linear models. Supplementary materials for this article are available online.
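
The contrast drawn in the abstract between marginal and joint screening can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the article: the model is restricted to the linear (Gaussian) case, the marginal score is a plain absolute correlation, and the sparsity-restricted fit is approximated by iterative hard thresholding, a common heuristic for such problems. The abstract does not say which algorithm the authors use. The function names (marginal_screen, hard_threshold, sparse_ls_screen) and the parameters k and n_iter are hypothetical.

```python
import numpy as np


def marginal_screen(X, y, k):
    """Marginal screening: keep the k features whose individual
    correlation with y is largest in absolute value."""
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each column
    yc = y - y.mean()
    score = np.abs(Xc.T @ yc)                  # one score per feature
    return np.sort(np.argsort(score)[-k:])


def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta and zero the rest."""
    out = np.zeros_like(beta)
    keep = np.argsort(np.abs(beta))[-k:]
    out[keep] = beta[keep]
    return out


def sparse_ls_screen(X, y, k, n_iter=200):
    """Joint screening sketch: least squares restricted to at most k
    nonzero coefficients, approximated by iterative hard thresholding
    (an assumed solver, not necessarily the article's)."""
    p = X.shape[1]
    # Step size 1/L, with L the largest eigenvalue of X'X.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        # Ascent direction of the Gaussian log-likelihood (up to scale).
        grad = X.T @ (y - X @ beta)
        beta = hard_threshold(beta + step * grad, k)
    return np.sort(np.flatnonzero(beta))
```

Both functions return the indices of the retained features. The first scores each column in isolation, while the second updates all coefficients together before truncating, so a feature's retention depends on what the other retained features already explain.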

Suggested Citation

  • Chen Xu & Jiahua Chen, 2014. "The Sparse MLE for Ultrahigh-Dimensional Feature Screening," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 1257-1269, September.
  • Handle: RePEc:taf:jnlasa:v:109:y:2014:i:507:p:1257-1269
    DOI: 10.1080/01621459.2013.879531

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/01621459.2013.879531
    Download Restriction: Access to full text is restricted to subscribers.


    References listed on IDEAS

    1. Jiahua Chen & Zehua Chen, 2008. "Extended Bayesian information criteria for model selection with large model spaces," Biometrika, Biometrika Trust, vol. 95(3), pages 759-771.
    2. Runze Li & Wei Zhong & Liping Zhu, 2012. "Feature Screening via Distance Correlation Learning," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(499), pages 1129-1139, September.
    3. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    4. Efron, Bradley, 2009. "Empirical Bayes Estimates for Large-Scale Prediction Problems," Journal of the American Statistical Association, American Statistical Association, vol. 104(487), pages 1015-1028.
    5. Efron, Bradley, 2007. "Correlation and Large-Scale Simultaneous Significance Testing," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 93-103, March.
    6. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    7. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    8. Wang, Hansheng, 2009. "Forward Regression for Ultra-High Dimensional Variable Screening," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1512-1524.
    9. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.

    Citations


    Cited by:

    1. Qiu, Debin & Ahn, Jeongyoun, 2020. "Grouped variable screening for ultra-high dimensional data for linear model," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    2. Dai, Linlin & Chen, Kani & Sun, Zhihua & Liu, Zhenqiu & Li, Gang, 2018. "Broken adaptive ridge regression and its asymptotic properties," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 334-351.
    3. Yi Chu & Lu Lin, 2020. "Conditional SIRS for nonparametric and semiparametric models by marginal empirical likelihood," Statistical Papers, Springer, vol. 61(4), pages 1589-1606, August.
    4. Linh H. Nghiem & Francis K.C. Hui & Samuel Müller & A.H. Welsh, 2023. "Screening methods for linear errors‐in‐variables models in high dimensions," Biometrics, The International Biometric Society, vol. 79(2), pages 926-939, June.
    5. Sheng, Ying & Wang, Qihua, 2020. "Model-free feature screening for ultrahigh dimensional classification," Journal of Multivariate Analysis, Elsevier, vol. 178(C).
    6. Liming Wang & Xingxiang Li & Xiaoqing Wang & Peng Lai, 2022. "Unified mean-variance feature screening for ultrahigh-dimensional regression," Computational Statistics, Springer, vol. 37(4), pages 1887-1918, September.
    7. Arun Srinivasan & Lingzhou Xue & Xiang Zhan, 2021. "Compositional knockoff filter for high‐dimensional regression analysis of microbiome data," Biometrics, The International Biometric Society, vol. 77(3), pages 984-995, September.
    8. Xiaolin Chen & Catherine Chunling Liu & Sheng Xu, 2021. "An efficient algorithm for joint feature screening in ultrahigh-dimensional Cox’s model," Computational Statistics, Springer, vol. 36(2), pages 885-910, June.
    9. Yang, Guangren & Zhang, Ling & Li, Runze & Huang, Yuan, 2019. "Feature screening in ultrahigh-dimensional varying-coefficient Cox model," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 284-297.
    10. Lu, Jun & Lin, Lu, 2018. "Feature screening for multi-response varying coefficient models with ultrahigh dimensional predictors," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 242-254.
    11. Xi Wu & Shifeng Xiong & Weiyan Mu, 2023. "An Ensemble Method for Feature Screening," Mathematics, MDPI, vol. 11(2), pages 1-14, January.
    12. Randall Reese & Guifang Fu & Geran Zhao & Xiaotian Dai & Xiaotian Li & Kenneth Chiu, 2022. "Epistasis Detection via the Joint Cumulant," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 14(3), pages 514-532, December.
    13. Xiaolin Chen & Yi Liu & Qihua Wang, 2019. "Joint feature screening for ultra-high-dimensional sparse additive hazards model by the sparsity-restricted pseudo-score estimator," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(5), pages 1007-1031, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    2. Dai, Linlin & Chen, Kani & Sun, Zhihua & Liu, Zhenqiu & Li, Gang, 2018. "Broken adaptive ridge regression and its asymptotic properties," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 334-351.
    3. Ruggieri, Eric & Lawrence, Charles E., 2012. "On efficient calculations for Bayesian variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1319-1332.
    4. Li, Xingxiang & Cheng, Guosheng & Wang, Liming & Lai, Peng & Song, Fengli, 2017. "Ultrahigh dimensional feature screening via projection," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 88-104.
    5. Liming Wang & Xingxiang Li & Xiaoqing Wang & Peng Lai, 2022. "Unified mean-variance feature screening for ultrahigh-dimensional regression," Computational Statistics, Springer, vol. 37(4), pages 1887-1918, September.
    6. Huiwen Wang & Ruiping Liu & Shanshan Wang & Zhichao Wang & Gilbert Saporta, 2020. "Ultra-high dimensional variable screening via Gram–Schmidt orthogonalization," Computational Statistics, Springer, vol. 35(3), pages 1153-1170, September.
    7. She, Yiyuan, 2012. "An iterative algorithm for fitting nonconvex penalized generalized linear models with grouped predictors," Computational Statistics & Data Analysis, Elsevier, vol. 56(10), pages 2976-2990.
    8. Wei Sun & Lexin Li, 2012. "Multiple Loci Mapping via Model-free Variable Selection," Biometrics, The International Biometric Society, vol. 68(1), pages 12-22, March.
    9. Xiangyu Wang & Chenlei Leng, 2016. "High dimensional ordinary least squares projection for screening variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(3), pages 589-611, June.
    10. Wang, Christina Dan & Chen, Zhao & Lian, Yimin & Chen, Min, 2022. "Asset selection based on high frequency Sharpe ratio," Journal of Econometrics, Elsevier, vol. 227(1), pages 168-188.
    11. Zhao, Bangxin & Liu, Xin & He, Wenqing & Yi, Grace Y., 2021. "Dynamic tilted current correlation for high dimensional variable screening," Journal of Multivariate Analysis, Elsevier, vol. 182(C).
    12. Min Chen & Yimin Lian & Zhao Chen & Zhengjun Zhang, 2017. "Sure explained variability and independence screening," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(4), pages 849-883, October.
    13. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    14. Zhao, Shaofei & Fu, Guifang, 2022. "Distribution-free and model-free multivariate feature screening via multivariate rank distance correlation," Journal of Multivariate Analysis, Elsevier, vol. 192(C).
    15. Aneiros, Germán & Novo, Silvia & Vieu, Philippe, 2022. "Variable selection in functional regression models: A review," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    16. Qinqin Hu & Lu Lin, 2022. "Feature Screening in High Dimensional Regression with Endogenous Covariates," Computational Economics, Springer;Society for Computational Economics, vol. 60(3), pages 949-969, October.
    17. Christis Katsouris, 2023. "High Dimensional Time Series Regression Models: Applications to Statistical Learning Methods," Papers 2308.16192, arXiv.org.
    18. Sheng, Ying & Wang, Qihua, 2020. "Model-free feature screening for ultrahigh dimensional classification," Journal of Multivariate Analysis, Elsevier, vol. 178(C).
    19. Huang, Qiming & Zhu, Yu, 2016. "Model-free sure screening via maximum correlation," Journal of Multivariate Analysis, Elsevier, vol. 148(C), pages 89-106.
    20. Sweata Sen & Damitri Kundu & Kiranmoy Das, 2023. "Variable selection for categorical response: a comparative study," Computational Statistics, Springer, vol. 38(2), pages 809-826, June.
