
A Bayesian hybrid Huberized support vector machine and its applications in high-dimensional medical data

Author

Listed:
  • Chakraborty, Sounak
  • Guo, Ruixin

Abstract

A hybrid Huberized support vector machine (HHSVM) with an elastic-net penalty has been developed for cancer tumor classification based on thousands of gene expression measurements. In this paper, we develop a Bayesian formulation of the hybrid Huberized support vector machine for binary classification. For the coefficients of the linear classification boundary, we propose a new type of prior, which can select variables and group them together simultaneously. Our proposed prior is a scale mixture of normal distributions and independent gamma priors on a transformation of the variance of the normal distributions. We establish a direct connection between the Bayesian HHSVM model with our special prior and the standard HHSVM solution with the elastic-net penalty. We propose a hierarchical Bayes technique and an empirical Bayes technique to select the penalty parameter. In the hierarchical Bayes model, the penalty parameter is selected using a beta prior. For the empirical Bayes model, we estimate the penalty parameter by maximizing the marginal likelihood. The proposed model is applied to two simulated data sets and three real-life gene expression microarray data sets. Results suggest that our Bayesian models are highly successful in selecting groups of similarly behaved important genes and predicting the cancer class. Most of the genes selected by our models have shown strong association with well-studied genetic pathways, further validating our claims.
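
As a rough illustration of the non-Bayesian HHSVM objective that the abstract builds on, the sketch below assumes the Huberized hinge loss in its commonly cited form (zero beyond the margin, quadratic near it, linear far from it) combined with an elastic-net penalty. The function names, the penalty parameterisation `lam1` and `lam2`, and the default threshold `delta = 2.0` are assumptions made for illustration; none of them is stated in this record.

```python
from typing import Sequence


def huberized_hinge(t: float, delta: float = 2.0) -> float:
    """Huberized hinge loss at margin t = y * f(x) (assumed form).

    Zero for t > 1, quadratic for 1 - delta < t <= 1, linear for t <= 1 - delta.
    The two pieces meet continuously at t = 1 - delta.
    """
    if t > 1.0:
        return 0.0
    if t > 1.0 - delta:
        return (1.0 - t) ** 2 / (2.0 * delta)
    return 1.0 - t - delta / 2.0


def hhsvm_objective(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    beta: Sequence[float],
    beta0: float,
    lam1: float,
    lam2: float,
    delta: float = 2.0,
) -> float:
    """Penalised objective: summed loss + lam1 * L1 norm + (lam2 / 2) * squared L2 norm.

    Labels y are assumed to be coded as +1 / -1 and the boundary is linear.
    """
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (beta0 + sum(b * x for b, x in zip(beta, xi)))
        loss += huberized_hinge(margin, delta)
    l1 = sum(abs(b) for b in beta)      # lasso part: encourages sparsity
    l2 = sum(b * b for b in beta)       # ridge part: encourages grouping
    return loss + lam1 * l1 + 0.5 * lam2 * l2


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [1, -1]
    print(hhsvm_objective(X, y, beta=[1.0, -1.0], beta0=0.0, lam1=0.5, lam2=1.0))
```

In this sketch the L1 term plays the variable-selection role and the L2 term the grouping role that the abstract attributes to the elastic-net penalty; the Bayesian model described above replaces the fixed penalty with a prior on the coefficients, which this code does not attempt to reproduce.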

Suggested Citation

  • Chakraborty, Sounak & Guo, Ruixin, 2011. "A Bayesian hybrid Huberized support vector machine and its applications in high-dimensional medical data," Computational Statistics & Data Analysis, Elsevier, vol. 55(3), pages 1342-1356, March.
  • Handle: RePEc:eee:csdana:v:55:y:2011:i:3:p:1342-1356

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(10)00367-1
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Chakraborty, Sounak, 2009. "Bayesian binary kernel probit model for microarray based cancer classification and gene selection," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4198-4209, October.
    2. Park, Trevor & Casella, George, 2008. "The Bayesian Lasso," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 681-686, June.
    3. Dudoit, S. & Fridlyand, J. & Speed, T. P., 2002. "Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 77-87, March.
    4. Tibshirani, Robert J., 2009. "Univariate Shrinkage in the Cox Model for High Dimensional Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-18, April.
    5. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    6. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    7. Lee, Yoonkyung & Lin, Yi & Wahba, Grace, 2004. "Multicategory Support Vector Machines: Theory and Application to the Classification of Microarray Data and Satellite Radiance Data," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 67-81, January.

    Citations


    Cited by:

    1. Pedro Duarte Silva, A., 2011. "Two-group classification with high-dimensional correlated data: A factor model approach," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 2975-2990, November.
    2. Mallick, Himel & Yi, Nengjun, 2017. "Bayesian group bridge for bi-level variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 110(C), pages 115-133.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lee, Kyu Ha & Chakraborty, Sounak & Sun, Jianguo, 2017. "Variable selection for high-dimensional genomic data with censored outcomes using group lasso prior," Computational Statistics & Data Analysis, Elsevier, vol. 112(C), pages 1-13.
    2. Bilin Zeng & Xuerong Meggie Wen & Lixing Zhu, 2017. "A link-free sparse group variable selection method for single-index model," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(13), pages 2388-2400, October.
    3. Philip D. Waggoner & Alec Macmillen, 2022. "Pursuing open-source development of predictive algorithms: the case of criminal sentencing algorithms," Journal of Computational Social Science, Springer, vol. 5(1), pages 89-109, May.
    4. Feihan Lu & Yao Zheng & Harrington Cleveland & Chris Burton & David Madigan, 2018. "Bayesian hierarchical vector autoregressive models for patient-level predictive modeling," PLOS ONE, Public Library of Science, vol. 13(12), pages 1-27, December.
    5. Gilles Celeux & Mohammed El Anbari & Jean-Michel Marin & Christian P. Robert, 2010. "Regularization in Regression : Comparing Bayesian and Frequentist Methods in a Poorly Informative Situation," Working Papers 2010-43, Center for Research in Economics and Statistics.
    6. Shutes, Karl & Adcock, Chris, 2013. "Regularized Extended Skew-Normal Regression," MPRA Paper 58445, University Library of Munich, Germany, revised 09 Sep 2014.
    7. Lee, Kyu Ha & Chakraborty, Sounak & Sun, Jianguo, 2011. "Bayesian Variable Selection in Semiparametric Proportional Hazards Model for High Dimensional Survival Data," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-32, April.
    8. Korobilis, Dimitris, 2013. "Hierarchical shrinkage priors for dynamic regressions with many predictors," International Journal of Forecasting, Elsevier, vol. 29(1), pages 43-59.
    9. Yu-Zhu Tian & Man-Lai Tang & Wai-Sum Chan & Mao-Zai Tian, 2021. "Bayesian bridge-randomized penalized quantile regression for ordinal longitudinal data, with application to firm’s bond ratings," Computational Statistics, Springer, vol. 36(2), pages 1289-1319, June.
    10. Billio, Monica & Casarin, Roberto & Rossini, Luca, 2019. "Bayesian nonparametric sparse VAR models," Journal of Econometrics, Elsevier, vol. 212(1), pages 97-115.
    11. Manisha Sanjay Sirsat & Paula Rodrigues Oblessuc & Ricardo S. Ramiro, 2022. "Genomic Prediction of Wheat Grain Yield Using Machine Learning," Agriculture, MDPI, vol. 12(9), pages 1-12, September.
    12. Chakraborty, Sounak, 2009. "Bayesian binary kernel probit model for microarray based cancer classification and gene selection," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4198-4209, October.
    13. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    14. Yagli, Gokhan Mert & Yang, Dazhi & Srinivasan, Dipti, 2019. "Automatic hourly solar forecasting using machine learning models," Renewable and Sustainable Energy Reviews, Elsevier, vol. 105(C), pages 487-498.
    15. Brendan P. W. Ames & Mingyi Hong, 2016. "Alternating direction method of multipliers for penalized zero-variance discriminant analysis," Computational Optimization and Applications, Springer, vol. 64(3), pages 725-754, July.
    16. Philip Kostov & Thankom Arun & Samuel Annim, 2014. "Financial Services to the Unbanked: the case of the Mzansi intervention in South Africa," Contemporary Economics, University of Economics and Human Sciences in Warsaw, vol. 8(2), June.
    17. Ruggieri, Eric & Lawrence, Charles E., 2012. "On efficient calculations for Bayesian variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1319-1332.
    18. Olivier Collignon & Jeongseop Han & Hyungmi An & Seungyoung Oh & Youngjo Lee, 2018. "Comparison of the modified unbounded penalty and the LASSO to select predictive genes of response to chemotherapy in breast cancer," PLOS ONE, Public Library of Science, vol. 13(10), pages 1-15, October.
    19. Mogliani, Matteo & Simoni, Anna, 2021. "Bayesian MIDAS penalized regressions: Estimation, selection, and prediction," Journal of Econometrics, Elsevier, vol. 222(1), pages 833-860.
    20. Gilles Charmet & Louis-Gautier Tran & Jérôme Auzanneau & Renaud Rincent & Sophie Bouchet, 2020. "BWGS: A R package for genomic selection and its application to a wheat breeding programme," PLOS ONE, Public Library of Science, vol. 15(4), pages 1-20, April.
