IDEAS home Printed from https://ideas.repec.org/a/eee/ecosta/v4y2017icp105-120.html

Identifying gene-environment interactions for prognosis using a robust approach

Author

Listed:
  • Chai, Hao
  • Zhang, Qingzhao
  • Jiang, Yu
  • Wang, Guohua
  • Zhang, Sanguo
  • Ahmed, Syed Ejaz
  • Ma, Shuangge

Abstract

For many complex diseases, prognosis is of essential importance. It has been shown that, beyond the main effects of genetic (G) and environmental (E) risk factors, gene-environment (G × E) interactions also play a critical role. In practical data analysis, part of the prognosis outcome data can have a distribution different from that of the rest of the data because of contamination or a mixture of subtypes. Literature has shown that data contamination as well as a mixture of distributions, if not properly accounted for, can lead to severely biased model estimation. In this study, we describe prognosis using an accelerated failure time (AFT) model. An exponential squared loss is proposed to accommodate data contamination or a mixture of distributions. A penalization approach is adopted for regularized estimation and marker selection. The proposed method is realized using an effective coordinate descent (CD) and minorization maximization (MM) algorithm. The estimation and identification consistency properties are rigorously established. Simulation shows that without contamination or mixture, the proposed method has performance comparable to or better than the nonrobust alternative. However, with contamination or mixture, it outperforms the nonrobust alternative and, under certain scenarios, is superior to the robust method based on quantile regression. The proposed method is applied to the analysis of TCGA (The Cancer Genome Atlas) lung cancer data. It identifies interactions different from those using the alternatives. The identified markers have important implications and satisfactory stability.
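
Illustration (not taken from the paper): the abstract names an exponential squared loss and an MM algorithm but gives no formulas. The sketch below assumes the common form of that loss, 1 - exp(-r^2 / gamma), and fits an ordinary linear regression by an MM iteration that reduces to repeated weighted least squares. It leaves out censoring, the penalty, and the G × E interaction structure, so it should not be read as the authors' algorithm. The names `exp_squared_loss`, `fit_exp_squared`, `demo`, and the tuning parameter `gamma` are hypothetical.

```python
import numpy as np


def exp_squared_loss(r, gamma):
    """Assumed form of the exponential squared loss: 1 - exp(-r^2 / gamma)."""
    return 1.0 - np.exp(-np.square(r) / gamma)


def fit_exp_squared(X, y, gamma, n_iter=100, tol=1e-10):
    """Minimise sum_i exp_squared_loss(y_i - x_i' beta, gamma) by MM.

    Because exp(u) >= 1 + u, the objective sum_i exp(-r_i^2 / gamma) is
    minorised at the current beta by a weighted quadratic, so each MM step
    is a weighted least squares fit with weights exp(-r_i^2 / gamma).
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # nonrobust starting value
    for _ in range(n_iter):
        w = np.exp(-np.square(y - X @ beta) / gamma)  # outliers get tiny weights
        XtW = X.T * w  # scales each observation by its weight
        new_beta = np.linalg.solve(XtW @ X, XtW @ y)
        if np.max(np.abs(new_beta - beta)) < tol:
            return new_beta
        beta = new_beta
    return beta


def demo(seed=0):
    """Simulated data: intercept 1, slope 2, with 20% of responses shifted by +20."""
    rng = np.random.default_rng(seed)
    n = 200
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=n)
    y[:40] += 20.0  # contamination
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    beta_rob = fit_exp_squared(X, y, gamma=10.0)
    return beta_ols, beta_rob


if __name__ == "__main__":
    beta_ols, beta_rob = demo()
    print("least squares:", beta_ols)
    print("exponential squared loss:", beta_rob)
```

Since the loss is bounded, a grossly contaminated observation contributes almost nothing to the weighted fit, which is the intuition behind the robustness claim in the abstract. The value of `gamma` here is picked by hand for the simulated data; how it should be chosen in practice is not covered by this sketch.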

Suggested Citation

  • Chai, Hao & Zhang, Qingzhao & Jiang, Yu & Wang, Guohua & Zhang, Sanguo & Ahmed, Syed Ejaz & Ma, Shuangge, 2017. "Identifying gene-environment interactions for prognosis using a robust approach," Econometrics and Statistics, Elsevier, vol. 4(C), pages 105-120.
  • Handle: RePEc:eee:ecosta:v:4:y:2017:i:c:p:105-120
    DOI: 10.1016/j.ecosta.2016.10.004

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S2452306216300065
    Download Restriction: Full text for ScienceDirect subscribers only. Contains open access articles

    File URL: https://libkey.io/10.1016/j.ecosta.2016.10.004?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Stute, W., 1993. "Consistent Estimation Under Random Censorship When Covariables Are Present," Journal of Multivariate Analysis, Elsevier, vol. 45(1), pages 89-103, April.
    2. Xueqin Wang & Yunlu Jiang & Mian Huang & Heping Zhang, 2013. "Robust Variable Selection With Exponential Squared Loss," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(502), pages 632-643, June.

    Citations

    Cited by:

    1. Qin, Yichen & Wang, Linna & Li, Yang & Li, Rong, 2023. "Visualization and assessment of model selection uncertainty," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    2. Yu, Ke & Luo, Shan, 2024. "Rank-based sequential feature selection for high-dimensional accelerated failure time models with main and interaction effects," Computational Statistics & Data Analysis, Elsevier, vol. 197(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ioannis Kalogridis & Gerda Claeskens & Stefan Aelst, 2023. "Robust and efficient estimation of nonparametric generalized linear models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(3), pages 1055-1078, September.
    2. Olivier Lopez & Xavier Milhaud & Pierre-Emmanuel Thérond, 2016. "Tree-based censored regression with applications in insurance," Post-Print hal-01141228, HAL.
    3. Qingguo Tang & R. J. Karunamuni, 2018. "Robust variable selection for finite mixture regression models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(3), pages 489-521, June.
    4. García, A., 2016. "Oaxaca-Blinder Type Counterfactual Decomposition Methods for Duration Outcomes," Documentos de Trabajo 14186, Universidad del Rosario.
    5. Jacobo de Uña-Álvarez & Luís Meira-Machado, 2015. "Nonparametric estimation of transition probabilities in the non-Markov illness-death model: A comparative study," Biometrics, The International Biometric Society, vol. 71(2), pages 364-375, June.
    6. Liang, Weijuan & Zhang, Qingzhao & Ma, Shuangge, 2024. "Hierarchical false discovery rate control for high-dimensional survival analysis with interactions," Computational Statistics & Data Analysis, Elsevier, vol. 192(C).
    7. Hang Zou & Xiaowen Huang & Yunlu Jiang, 2025. "Robust variable selection for additive coefficient models," Computational Statistics, Springer, vol. 40(2), pages 977-997, February.
    8. Jad Beyhum, 2021. "Two-stage least squares with a randomly right censored outcome," Papers 2110.05107, arXiv.org.
    9. Gabriela Ciuperca, 2018. "Test by adaptive LASSO quantile method for real-time detection of a change-point," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 81(6), pages 689-720, August.
    10. Olivier Lopez & Xavier Milhaud & Pierre-Emmanuel Thérond, 2016. "Tree-based censored regression with applications in insurance," Post-Print hal-01364437, HAL.
    11. Zhiping Qiu & Jing Qin & Yong Zhou, 2016. "Composite Estimating Equation Method for the Accelerated Failure Time Model with Length-biased Sampling Data," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(2), pages 396-415, June.
    12. Pedro H. C. Sant’Anna, 2021. "Nonparametric Tests for Treatment Effect Heterogeneity With Duration Outcomes," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 39(3), pages 816-832, July.
    13. de Uña-Álvarez, Jacobo & Meira-Machado, Luis F., 2008. "A simple estimator of the bivariate distribution function for censored gap times," Statistics & Probability Letters, Elsevier, vol. 78(15), pages 2440-2445, October.
    14. Wenceslao González-Manteiga & Rosa Crujeiras, 2013. "An updated review of Goodness-of-Fit tests for regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 22(3), pages 361-411, September.
    15. Jad Beyhum & Ingrid Keilegom, 2023. "Robust censored regression with $\ell_1$-norm regularization," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(1), pages 146-162, March.
    16. Wang Zhu & Wang C.Y., 2010. "Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-33, June.
    17. Bao, Yanchun & He, Shuyuan & Mei, Changlin, 2007. "The Koul-Susarla-Van Ryzin and weighted least squares estimates for censored linear regression model: A comparative study," Computational Statistics & Data Analysis, Elsevier, vol. 51(12), pages 6488-6497, August.
    18. Liya Fu & Zhuoran Yang & Fengjing Cai & You-Gan Wang, 2021. "Efficient and doubly-robust methods for variable selection and parameter estimation in longitudinal data analysis," Computational Statistics, Springer, vol. 36(2), pages 781-804, June.
    19. Lopez, O. & Patilea, V., 2009. "Nonparametric lack-of-fit tests for parametric mean-regression models with censored data," Journal of Multivariate Analysis, Elsevier, vol. 100(1), pages 210-230, January.
    20. Paul Janssen & Noël Veraverbeke, 2024. "Nonparametric estimation of univariate and bivariate survival functions under right censoring: a survey," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 87(3), pages 211-245, April.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:ecosta:v:4:y:2017:i:c:p:105-120. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: https://www.journals.elsevier.com/econometrics-and-statistics .