IDEAS home Printed from https://ideas.repec.org/a/spr/aistmt/v70y2018i2d10.1007_s10463-016-0597-2.html

Model-free feature screening for ultrahigh-dimensional data conditional on some variables

Authors

  • Yi Liu (Chinese Academy of Sciences; China University of Petroleum)
  • Qihua Wang (Chinese Academy of Sciences; Shenzhen University)

Abstract

In this paper, the conditional distance correlation (CDC) is used as a measure of correlation to develop a feature screening procedure for ultrahigh-dimensional data, conditional on some significant variables. The proposed procedure, called conditional distance correlation-sure independence screening (CDC-SIS for short), is model-free: no model structure between the response and the predictors is specified, which is appealing in some practical problems of ultrahigh-dimensional data analysis. The sure screening property of CDC-SIS is proved, and a simulation study is conducted to evaluate its finite-sample performance. A real data analysis illustrates the proposed method. The results indicate that CDC-SIS performs well.
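The ranking step behind this family of screening methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the unconditional sample distance correlation (as in DC-SIS) as a stand-in utility, because the conditional distance correlation estimator is not given in the abstract. The function names `_centered_dist`, `distance_correlation`, and `screen`, and the simulated data, are hypothetical. A conditional variant would, presumably, pass a utility that measures dependence given the conditioning variables through the `utility` argument.

```python
import numpy as np


def _centered_dist(v):
    """Double-centered pairwise Euclidean distance matrix of the rows of v."""
    v = np.asarray(v, dtype=float)
    if v.ndim == 1:
        v = v[:, None]
    d = np.sqrt(((v[:, None, :] - v[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation (V-statistic form); values lie in [0, 1]."""
    a, b = _centered_dist(x), _centered_dist(y)
    dcov2 = (a * b).mean()                               # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())     # product of distance std. devs.
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))


def screen(X, y, d, utility=distance_correlation):
    """Score each column of X against y and keep the d highest-scoring columns.

    `utility` is a stand-in (assumption): unconditional distance correlation,
    not the conditional distance correlation used by CDC-SIS.
    """
    scores = np.array([utility(X[:, k], y) for k in range(X.shape[1])])
    keep = np.argsort(scores)[::-1][:d]
    return keep, scores


if __name__ == "__main__":
    # Simulated example (assumption): only column 0 drives the response, nonlinearly.
    rng = np.random.default_rng(0)
    n, p = 80, 200
    X = rng.standard_normal((n, p))
    y = X[:, 0] ** 2 + 0.1 * rng.standard_normal(n)
    keep, scores = screen(X, y, d=10)
    print(sorted(keep.tolist()))
```

No model for the relationship between `y` and the columns of `X` is fitted at any point; the predictors are only ranked by a dependence measure, which is the sense in which such procedures are model-free.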

Suggested Citation

  • Yi Liu & Qihua Wang, 2018. "Model-free feature screening for ultrahigh-dimensional data conditional on some variables," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(2), pages 283-301, April.
  • Handle: RePEc:spr:aistmt:v:70:y:2018:i:2:d:10.1007_s10463-016-0597-2
    DOI: 10.1007/s10463-016-0597-2

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10463-016-0597-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10463-016-0597-2?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Jingyuan Liu & Runze Li & Rongling Wu, 2014. "Feature Selection for Varying Coefficient Models With Ultrahigh-Dimensional Covariates," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(505), pages 266-274, March.
    2. Runze Li & Wei Zhong & Liping Zhu, 2012. "Feature Screening via Distance Correlation Learning," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(499), pages 1129-1139, September.
    3. Xueqin Wang & Wenliang Pan & Wenhao Hu & Yuan Tian & Heping Zhang, 2015. "Conditional Distance Correlation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(512), pages 1726-1734, December.
    4. Fan, Jianqing & Feng, Yang & Song, Rui, 2011. "Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Additive Models," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 544-557.
    5. Jianqing Fan & Yunbei Ma & Wei Dai, 2014. "Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Varying Coefficient Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 1270-1284, September.
    6. Harrison, David Jr. & Rubinfeld, Daniel L., 1978. "Hedonic housing prices and the demand for clean air," Journal of Environmental Economics and Management, Elsevier, vol. 5(1), pages 81-102, March.
    7. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.

    Citations



    Cited by:

    1. Jing Zhang & Haibo Zhou & Yanyan Liu & Jianwen Cai, 2021. "Conditional screening for ultrahigh-dimensional survival data in case-cohort studies," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 27(4), pages 632-661, October.
    2. Xiaochao Xia & Hao Ming, 2022. "A Flexibly Conditional Screening Approach via a Nonparametric Quantile Partial Correlation," Mathematics, MDPI, vol. 10(24), pages 1-32, December.
    3. Jing Zhang & Yanyan Liu & Hengjian Cui, 2021. "Model-free feature screening via distance correlation for ultrahigh dimensional survival data," Statistical Papers, Springer, vol. 62(6), pages 2711-2738, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wang, Christina Dan & Chen, Zhao & Lian, Yimin & Chen, Min, 2022. "Asset selection based on high frequency Sharpe ratio," Journal of Econometrics, Elsevier, vol. 227(1), pages 168-188.
    2. Lu, Jun & Lin, Lu, 2018. "Feature screening for multi-response varying coefficient models with ultrahigh dimensional predictors," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 242-254.
    3. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    4. Zhang, Shucong & Pan, Jing & Zhou, Yong, 2018. "Robust conditional nonparametric independence screening for ultrahigh-dimensional data," Statistics & Probability Letters, Elsevier, vol. 143(C), pages 95-101.
    5. Yang, Baoying & Yin, Xiangrong & Zhang, Nan, 2019. "Sufficient variable selection using independence measures for continuous response," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 480-493.
    6. Ke, Chenlu & Yang, Wei & Yuan, Qingcong & Li, Lu, 2023. "Partial sufficient variable screening with categorical controls," Computational Statistics & Data Analysis, Elsevier, vol. 187(C).
    7. Ma, Xuejun & Zhang, Jingxiao, 2016. "Robust model-free feature screening via quantile correlation," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 472-480.
    8. Akira Shinkyu, 2023. "Forward Selection for Feature Screening and Structure Identification in Varying Coefficient Models," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 85(1), pages 485-511, February.
    9. Zhang, Shen & Zhao, Peixin & Li, Gaorong & Xu, Wangli, 2019. "Nonparametric independence screening for ultra-high dimensional generalized varying coefficient models with longitudinal data," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 37-52.
    10. Zhou, Yeqing & Liu, Jingyuan & Zhu, Liping, 2020. "Test for conditional independence with application to conditional screening," Journal of Multivariate Analysis, Elsevier, vol. 175(C).
    11. Jun Lu & Lu Lin, 2020. "Model-free conditional screening via conditional distance correlation," Statistical Papers, Springer, vol. 61(1), pages 225-244, February.
    12. Yuan, Qingcong & Chen, Xianyan & Ke, Chenlu & Yin, Xiangrong, 2022. "Independence index sufficient variable screening for categorical responses," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    13. Li, Yujie & Li, Gaorong & Lian, Heng & Tong, Tiejun, 2017. "Profile forward regression screening for ultra-high dimensional semiparametric varying coefficient partially linear models," Journal of Multivariate Analysis, Elsevier, vol. 155(C), pages 133-150.
    14. Xiaochao Xia & Hao Ming, 2022. "A Flexibly Conditional Screening Approach via a Nonparametric Quantile Partial Correlation," Mathematics, MDPI, vol. 10(24), pages 1-32, December.
    15. He, Yong & Zhang, Liang & Ji, Jiadong & Zhang, Xinsheng, 2019. "Robust feature screening for elliptical copula regression model," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 568-582.
    16. Min Chen & Yimin Lian & Zhao Chen & Zhengjun Zhang, 2017. "Sure explained variability and independence screening," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 29(4), pages 849-883, October.
    17. Xin-Bing Kong & Zhi Liu & Yuan Yao & Wang Zhou, 2017. "Sure screening by ranking the canonical correlations," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 26(1), pages 46-70, March.
    18. Zhong, Wei & Wang, Jiping & Chen, Xiaolin, 2021. "Censored mean variance sure independence screening for ultrahigh dimensional survival data," Computational Statistics & Data Analysis, Elsevier, vol. 159(C).
    19. Yan, Xiaodong & Tang, Niansheng & Xie, Jinhan & Ding, Xianwen & Wang, Zhiqiang, 2018. "Fused mean–variance filter for feature screening," Computational Statistics & Data Analysis, Elsevier, vol. 122(C), pages 18-32.
    20. Tang, Niansheng & Xia, Linli & Yan, Xiaodong, 2019. "Feature screening in ultrahigh-dimensional partially linear models with missing responses at random," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 208-227.
