Printed from https://ideas.repec.org/a/eee/csdana/v173y2022ics0167947322000846.html

Feature screening and FDR control with knockoff features for ultrahigh-dimensional right-censored data

Author

Listed:
  • Pan, Yingli

Abstract

A model-free feature screening method for ultrahigh-dimensional right-censored data is advocated. A two-step approach, with the help of knockoff features, is proposed to specify the threshold for feature screening such that the false discovery rate (FDR) is controlled under a prespecified level. The proposed two-step approach enjoys both a sure screening property with high probability and FDR control simultaneously if the prespecified FDR level is greater than or equal to 1/s, where s is the number of active features. The finite sample properties of the newly suggested method are assessed through simulation studies. An application to the mantle cell lymphoma (MCL) study demonstrates the utility of the proposed method in practice.
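The abstract does not spell out the thresholding rule, so the following is only a minimal sketch of the generic "knockoff+" threshold from the knockoff-filter literature, not necessarily the exact rule used in the paper. It assumes a hypothetical statistic W_j for each feature, for example the screening utility of feature j minus that of its knockoff copy, so that large positive values suggest an active feature. The function names and the toy values are illustrative assumptions.

```python
import numpy as np

def knockoff_plus_threshold(w, q):
    """Smallest t with (1 + #{w_j <= -t}) / max(#{w_j >= t}, 1) <= q.

    Returns inf when no candidate threshold meets the target level q.
    w is a hypothetical vector of feature-minus-knockoff statistics.
    """
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of w.
    candidates = np.sort(np.unique(np.abs(w[w != 0])))
    for t in candidates:
        # Estimated false discovery proportion at threshold t.
        fdp_hat = (1 + np.sum(w <= -t)) / max(np.sum(w >= t), 1)
        if fdp_hat <= q:
            return float(t)
    return float("inf")

def select(w, q):
    """Indices of features whose statistic reaches the threshold."""
    t = knockoff_plus_threshold(w, q)
    return np.flatnonzero(np.asarray(w, dtype=float) >= t)

# Toy statistics (made up for illustration only).
w = [5, 4, 3, 2.5, -1, 0.5, -0.2, 2, 1.5, 3.5]
print(select(w, q=0.2))  # indices of selected features
print(select(w, q=0.1))  # empty: the level is too small for this many features
```

Under this rule the estimated proportion is at least 1 divided by the number of selected features, so nothing can be selected unless that number is at least 1/q. This may be one way to read the abstract's requirement that the prespecified FDR level be at least 1/s.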

Suggested Citation

  • Pan, Yingli, 2022. "Feature screening and FDR control with knockoff features for ultrahigh-dimensional right-censored data," Computational Statistics & Data Analysis, Elsevier, vol. 173(C).
  • Handle: RePEc:eee:csdana:v:173:y:2022:i:c:s0167947322000846
    DOI: 10.1016/j.csda.2022.107504

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947322000846
    Download Restriction: Full text for ScienceDirect subscribers only.



    Citations



    Cited by:

    1. Konstantin Gorgen & Abdolreza Nazemi & Melanie Schienle, 2022. "Robust Knockoffs for Controlling False Discoveries With an Application to Bond Recovery Rates," Papers 2206.06026, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jing Zhang & Haibo Zhou & Yanyan Liu & Jianwen Cai, 2021. "Conditional screening for ultrahigh-dimensional survival data in case-cohort studies," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 27(4), pages 632-661, October.
    2. Zhong, Wei & Wang, Jiping & Chen, Xiaolin, 2021. "Censored mean variance sure independence screening for ultrahigh dimensional survival data," Computational Statistics & Data Analysis, Elsevier, vol. 159(C).
    3. Jing Zhang & Yanyan Liu & Hengjian Cui, 2021. "Model-free feature screening via distance correlation for ultrahigh dimensional survival data," Statistical Papers, Springer, vol. 62(6), pages 2711-2738, December.
    4. Chen, Xiaolin & Chen, Xiaojing & Wang, Hong, 2018. "Robust feature screening for ultra-high dimensional right censored data via distance correlation," Computational Statistics & Data Analysis, Elsevier, vol. 119(C), pages 118-138.
    5. Jing Zhang & Haibo Zhou & Yanyan Liu & Jianwen Cai, 2021. "Feature screening for case‐cohort studies with failure time outcome," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(1), pages 349-370, March.
    6. Jing Pan & Yuan Yu & Yong Zhou, 2018. "Nonparametric independence feature screening for ultrahigh-dimensional survival data," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 81(7), pages 821-847, October.
    7. Jing Zhang & Qihua Wang & Xuan Wang, 2022. "Surrogate-variable-based model-free feature screening for survival data under the general censoring mechanism," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 74(2), pages 379-397, April.
    8. Jing Zhang & Guosheng Yin & Yanyan Liu & Yuanshan Wu, 2018. "Censored cumulative residual independent screening for ultrahigh-dimensional survival data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 24(2), pages 273-292, April.
    9. Jinfeng Xu & Wai Keung Li & Zhiliang Ying, 2020. "Variable screening for survival data in the presence of heterogeneous censoring," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(4), pages 1171-1191, December.
    10. Zhang, Jing & Liu, Yanyan & Wu, Yuanshan, 2017. "Correlation rank screening for ultrahigh-dimensional survival data," Computational Statistics & Data Analysis, Elsevier, vol. 108(C), pages 121-132.
    11. Liu, Yanyan & Zhang, Jing & Zhao, Xingqiu, 2018. "A new nonparametric screening method for ultrahigh-dimensional survival data," Computational Statistics & Data Analysis, Elsevier, vol. 119(C), pages 74-85.
    12. Xiaolin Chen & Catherine Chunling Liu & Sheng Xu, 2021. "An efficient algorithm for joint feature screening in ultrahigh-dimensional Cox’s model," Computational Statistics, Springer, vol. 36(2), pages 885-910, June.
    13. Xiaolin Chen & Yi Liu & Qihua Wang, 2019. "Joint feature screening for ultra-high-dimensional sparse additive hazards model by the sparsity-restricted pseudo-score estimator," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(5), pages 1007-1031, October.
    14. Qu, Lianqiang & Wang, Xiaoyu & Sun, Liuquan, 2022. "Variable screening for varying coefficient models with ultrahigh-dimensional survival data," Computational Statistics & Data Analysis, Elsevier, vol. 172(C).
    15. Grace Y. Yi & Wenqing He & Raymond J. Carroll, 2022. "Feature screening with large‐scale and high‐dimensional survival data," Biometrics, The International Biometric Society, vol. 78(3), pages 894-907, September.
    16. Hyokyoung G. Hong & Xuerong Chen & David C. Christiani & Yi Li, 2018. "Integrated powered density: Screening ultrahigh dimensional covariates with survival outcomes," Biometrics, The International Biometric Society, vol. 74(2), pages 421-429, June.
    17. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    18. Tang, Niansheng & Xia, Linli & Yan, Xiaodong, 2019. "Feature screening in ultrahigh-dimensional partially linear models with missing responses at random," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 208-227.
    19. Fengli Song & Peng Lai & Baohua Shen, 2020. "Robust composite weighted quantile screening for ultrahigh dimensional discriminant analysis," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 799-820, October.
    20. Jianglin Fang, 2021. "Feature screening for ultrahigh-dimensional survival data when failure indicators are missing at random," Statistical Papers, Springer, vol. 62(3), pages 1141-1166, June.
