
Finite mixtures of semiparametric Bayesian survival kernel machine regressions: Application to breast cancer gene pathway subgroup analysis

Author

Listed:
  • Lin Zhang
  • Inyoung Kim

Abstract

A gene pathway is defined as a set of genes that functionally work together to regulate a certain biological process. Gene pathway expression data, which is a special case of highly correlated high‐dimensional data, exhibits the ‘small n and large p’ problem. Pathway analysis can take into account the dependency structures among genes and the possibility that several moderately regulated genes may have significant impacts on the clinical outcomes. To test the significance of gene pathways in the presence of subgroups, we propose a finite mixture model of semiparametric Bayesian survival kernel machine regressions (fm‐BKSurv). Within each hidden group, we model the unknown function of gene pathways via a Gaussian kernel machine. In a simulation study, we demonstrate that fm‐BKSurv performs well in terms of true positive rate, false positive rate, accuracy, and precision. We further illustrate its advantage in detecting significant gene pathways using a gene pathway expression dataset of breast cancer patients.
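
The abstract does not spell out the kernel itself, so the following is only a minimal sketch of the Gaussian kernel idea, not the authors' implementation. It assumes the common parameterization K(z_i, z_j) = exp(-||z_i - z_j||^2 / rho); the function name `gaussian_kernel_matrix`, the scale parameter `rho`, and the toy expression values are all hypothetical and chosen for illustration.

```python
import math


def gaussian_kernel_matrix(samples, rho):
    """Return the n x n Gram matrix K with K[i][j] = exp(-||z_i - z_j||^2 / rho).

    `samples` is a list of n equal-length gene-expression vectors (one per
    patient); `rho` is an assumed positive scale parameter.
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    n = len(samples)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Squared Euclidean distance between patients i and j.
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            value = math.exp(-sq_dist / rho)
            # The kernel is symmetric, so fill both triangles at once.
            gram[i][j] = value
            gram[j][i] = value
    return gram


# Hypothetical toy data: 3 patients (rows) by 4 genes in one pathway (columns).
expression = [
    [0.1, 1.2, -0.3, 0.8],
    [0.0, 1.0, -0.5, 0.9],
    [2.1, -0.7, 1.4, -1.0],
]
K = gaussian_kernel_matrix(expression, rho=4.0)
```

The resulting matrix is n x n regardless of how many genes the pathway contains, which is one reason kernel methods are a natural fit for the ‘small n and large p’ setting described above.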

Suggested Citation

  • Lin Zhang & Inyoung Kim, 2021. "Finite mixtures of semiparametric Bayesian survival kernel machine regressions: Application to breast cancer gene pathway subgroup analysis," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(2), pages 251-269, March.
  • Handle: RePEc:bla:jorssc:v:70:y:2021:i:2:p:251-269
    DOI: 10.1111/rssc.12457

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssc.12457
    Download Restriction: no


    References listed on IDEAS

    1. Allison, David B. & Gadbury, Gary L. & Heo, Moonseong & Fernandez, Jose R. & Lee, Cheol-Koo & Prolla, Tomas A. & Weindruch, Richard, 2002. "A mixture model approach for the analysis of microarray gene expression data," Computational Statistics & Data Analysis, Elsevier, vol. 39(1), pages 1-20, March.
    2. Tianxi Cai & Giulia Tonini & Xihong Lin, 2011. "Kernel Machine Approach to Testing the Significance of Multiple Genetic Markers for Risk Prediction," Biometrics, The International Biometric Society, vol. 67(3), pages 975-986, September.
    3. Luping Zhao & Timothy E. Hanson & Bradley P. Carlin, 2009. "Mixtures of Polya trees for flexible spatial frailty survival modelling," Biometrika, Biometrika Trust, vol. 96(2), pages 263-276.
    4. Dawei Liu & Xihong Lin & Debashis Ghosh, 2007. "Semiparametric Regression of Multidimensional Genetic Pathway Data: Least-Squares Kernel Machines and Linear Mixed Models," Biometrics, The International Biometric Society, vol. 63(4), pages 1079-1088, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Long Qu & Tobias Guennel & Scott L. Marshall, 2013. "Linear Score Tests for Variance Components in Linear Mixed Models and Applications to Genetic Association Studies," Biometrics, The International Biometric Society, vol. 69(4), pages 883-892, December.
    2. Ghosh, Debashis, 2014. "An asymptotically minimax kernel machine," Statistics & Probability Letters, Elsevier, vol. 95(C), pages 33-38.
    3. Cho, Youngjoo & Zhan, Xiang & Ghosh, Debashis, 2022. "Nonlinear predictive directions in clinical trials," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    4. Parrish, Rudolph S. & Spencer III, Horace J. & Xu, Ping, 2009. "Distribution modeling and simulation of gene expression data," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1650-1660, March.
    5. Zaili Fang & Inyoung Kim & Jeesun Jung, 2018. "Semiparametric Kernel-Based Regression for Evaluating Interaction Between Pathway Effect and Covariate," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(1), pages 129-152, March.
    6. Luping Zhao & Timothy E. Hanson, 2011. "Spatially Dependent Polya Tree Modeling for Survival Data," Biometrics, The International Biometric Society, vol. 67(2), pages 391-403, June.
    7. Ghosh, Debashis, 2012. "Incorporating the Empirical Null Hypothesis into the Benjamini-Hochberg Procedure," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(4), pages 1-21, July.
    8. Arnab Maity & Xihong Lin, 2011. "Powerful Tests for Detecting a Gene Effect in the Presence of Possible Gene–Gene Interactions Using Garrote Kernel Machines," Biometrics, The International Biometric Society, vol. 67(4), pages 1271-1284, December.
    9. Angela Schörgendorfer & Adam J. Branscum & Timothy E. Hanson, 2013. "A Bayesian Goodness of Fit Test and Semiparametric Generalization of Logistic Regression with Measurement Data," Biometrics, The International Biometric Society, vol. 69(2), pages 508-519, June.
    10. Teran Hidalgo, Sebastian J. & Wu, Michael C. & Engel, Stephanie M. & Kosorok, Michael R., 2018. "Goodness-of-fit test for nonparametric regression models: Smoothing spline ANOVA models as example," Computational Statistics & Data Analysis, Elsevier, vol. 122(C), pages 135-155.
    11. He, Yi & Pan, Wei & Lin, Jizhen, 2006. "Cluster analysis using multivariate normal mixture models to detect differential gene expression with microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 641-658, November.
    12. Wenjing Qi & Andrew S Allen & Yi-Ju Li, 2019. "Family-based association tests for rare variants with censored traits," PLOS ONE, Public Library of Science, vol. 14(1), pages 1-17, January.
    13. Chakraborty, Sounak, 2009. "Bayesian binary kernel probit model for microarray based cancer classification and gene selection," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4198-4209, October.
    14. Dehan Kong & Joseph G. Ibrahim & Eunjee Lee & Hongtu Zhu, 2018. "FLCRM: Functional linear cox regression model," Biometrics, The International Biometric Society, vol. 74(1), pages 109-117, March.
    15. Cheng, Cheng, 2009. "Internal validation inferences of significant genomic features in genome-wide screening," Computational Statistics & Data Analysis, Elsevier, vol. 53(3), pages 788-800, January.
    16. Fan, Caiyun & Lu, Wenbin & Zhou, Yong, 2021. "Testing error heterogeneity in censored linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 161(C).
    17. Robert R. Delongchamp & John F. Bowyer & James J. Chen & Ralph L. Kodell, 2004. "Multiple-Testing Strategy for Analyzing cDNA Array Data on Gene Expression," Biometrics, The International Biometric Society, vol. 60(3), pages 774-782, September.
    18. Yunxuan Jiang & Karen N. Conneely & Michael P. Epstein, 2018. "Robust Rare-Variant Association Tests for Quantitative Traits in General Pedigrees," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 10(3), pages 491-505, December.
    19. Xiang, Qinfang & Edwards, Jode & Gadbury, Gary L., 2006. "Interval estimation in a finite mixture model: Modeling P-values in multiple testing applications," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 570-586, November.
    20. Haiming Zhou & Timothy Hanson & Jiajia Zhang, 2017. "Generalized accelerated failure time spatial frailty model for arbitrarily censored data," Lifetime Data Analysis: An International Journal Devoted to Statistical Methods and Applications for Time-to-Event Data, Springer, vol. 23(3), pages 495-515, July.
