Printed from https://ideas.repec.org/a/plo/pone00/0014579.html

Optimization Based Tumor Classification from Microarray Gene Expression Data

Authors

  • Onur Dagliyan
  • Fadime Uney-Yuksektepe
  • I Halil Kavakli
  • Metin Turkay

Abstract

Background: An important use of microarray data is the classification of tumor types according to the genes that are up- or down-regulated in specific cancer types. A number of algorithms have been proposed for this task, but they usually require parameter optimization that depends on the type of data to produce accurate results. It is also critical to find an optimal set of markers among the up- or down-regulated genes that can be used clinically to build assays for diagnosing specific cancer types or following their progression. In this paper, we employ a mixed-integer programming based classification algorithm, the hyper-box enclosure (HBE) method, to classify several cancer types with a minimal set of predictor genes. This optimization-based method is a user-friendly and efficient classifier that may allow clinicians to diagnose certain cancer types and follow their progression.

Methodology/Principal Findings: We apply the HBE algorithm to well-known data sets, such as leukemia, prostate cancer, diffuse large B-cell lymphoma (DLBCL), and small round blue cell tumors (SRBCT), to find predictor genes that can be used for diagnosis and prognosis robustly and with high accuracy. Our approach does not require any modification or parameter optimization for each data set. The information gain attribute evaluator, the relief attribute evaluator, and correlation-based feature selection are employed for gene selection. The results are compared with those from other studies, and the biological roles of the selected genes in the corresponding cancer types are described.

Conclusions/Significance: Overall, our algorithm performed better than the other algorithms reported in the literature and the classifiers found in the WEKA data-mining package. Because it does not require parameter optimization and consistently achieves very high prediction rates on different types of data sets, the HBE method is an effective and consistent tool for cancer type prediction with a small number of gene markers.
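
To make the gene-selection step more concrete, the following is a minimal sketch of ranking genes by information gain, one of the three selection criteria named in the abstract. It is not the authors' code and not WEKA's implementation: the median split used to discretize expression values, the function names, the gene names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one gene after a binary split at its median.

    The median split is an assumption of this sketch; other tools may
    discretize expression values differently.
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        threshold = ordered[mid]
    else:
        threshold = (ordered[mid - 1] + ordered[mid]) / 2
    low = [y for x, y in zip(values, labels) if x <= threshold]
    high = [y for x, y in zip(values, labels) if x > threshold]
    # Weighted entropy of the two partitions (empty partitions are skipped).
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - remainder


def rank_genes(expression, labels):
    """Return gene names sorted by decreasing information gain."""
    gains = {gene: information_gain(vals, labels) for gene, vals in expression.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Toy, made-up expression values for two hypothetical genes and four samples.
expression = {
    "gene_a": [0.1, 0.2, 0.9, 1.1],
    "gene_b": [0.5, 0.9, 0.4, 1.0],
}
labels = ["ALL", "ALL", "AML", "AML"]
ranking = rank_genes(expression, labels)
```

In this toy example, gene_a separates the two classes perfectly at its median while gene_b does not, so gene_a is ranked first. A classifier would then be trained on only the top-ranked genes.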

Suggested Citation

  • Onur Dagliyan & Fadime Uney-Yuksektepe & I Halil Kavakli & Metin Turkay, 2011. "Optimization Based Tumor Classification from Microarray Gene Expression Data," PLOS ONE, Public Library of Science, vol. 6(2), pages 1-10, February.
  • Handle: RePEc:plo:pone00:0014579
    DOI: 10.1371/journal.pone.0014579

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0014579
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0014579&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Naijun Sha & Marina Vannucci & Mahlet G. Tadesse & Philip J. Brown & Ilaria Dragoni & Nick Davies & Tracy C. Roberts & Andrea Contestabile & Mike Salmon & Chris Buckley & Francesco Falciani, 2004. "Bayesian Variable Selection in Multinomial Probit Models to Identify Molecular Signatures of Disease Stage," Biometrics, The International Biometric Society, vol. 60(3), pages 812-819, September.

    Citations



    Cited by:

    1. Zhenqiu Liu & Dechang Chen & Li Sheng & Amy Y Liu, 2013. "Class Prediction and Feature Selection with Linear Optimization for Metagenomic Count Data," PLOS ONE, Public Library of Science, vol. 8(3), pages 1-7, March.
    2. Bertolazzi, P. & Felici, G. & Festa, P. & Fiscon, G. & Weitschek, E., 2016. "Integer programming models for feature selection: New extensions and a randomized solution algorithm," European Journal of Operational Research, Elsevier, vol. 250(2), pages 389-399.
    3. Fadime Üney-Yüksektepe, 2014. "A novel approach to cutting decision trees," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 22(3), pages 553-565, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Riccardo (Jack) Lucchetti & Luca Pedini, 2020. "ParMA: Parallelised Bayesian Model Averaging for Generalised Linear Models," Working Papers 2020:28, Department of Economics, University of Venice "Ca' Foscari".
    2. Victor Trevino & Mahlet G Tadesse & Marina Vannucci & Fatima Al-Shahrour & Philipp Antczak & Sarah Durant & Andreas Bikfalvi & Joaquin Dopazo & Moray J Campbell & Francesco Falciani, 2011. "Analysis of Normal-Tumour Tissue Interaction in Tumours: Prediction of Prostate Cancer Features from the Molecular Profile of Adjacent Normal Cells," PLOS ONE, Public Library of Science, vol. 6(3), pages 1-13, March.
    3. Lee Kyu Ha & Chakraborty Sounak & Sun Jianguo, 2011. "Bayesian Variable Selection in Semiparametric Proportional Hazards Model for High Dimensional Survival Data," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-32, April.
    4. Chakraborty, Sounak, 2009. "Simultaneous cancer classification and gene selection with Bayesian nearest neighbor method: An integrated approach," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1462-1474, February.
    5. Chakraborty, Sounak, 2009. "Bayesian binary kernel probit model for microarray based cancer classification and gene selection," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4198-4209, October.
    6. Naijun Sha & Benard Owusu Dechi, 2019. "A Bayes Inference for Ordinal Response with Latent Variable Approach," Stats, MDPI, vol. 2(2), pages 1-11, June.
    7. Aijun Yang & Xuejun Jiang & Lianjie Shu & Jinguan Lin, 2017. "Bayesian variable selection with sparse and correlation priors for high-dimensional data analysis," Computational Statistics, Springer, vol. 32(1), pages 127-143, March.
    8. Alberto Cassese & Michele Guindani & Philipp Antczak & Francesco Falciani & Marina Vannucci, 2015. "A Bayesian model for the identification of differentially expressed genes in Daphnia magna exposed to munition pollutants," Biometrics, The International Biometric Society, vol. 71(3), pages 803-811, September.
    9. Baragatti, M. & Pommeret, D., 2012. "A study of variable selection using g-prior distribution with ridge parameter," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1920-1934.
    10. Nicolai Meinshausen & Peter Bühlmann, 2010. "Stability selection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(4), pages 417-473, September.
    11. Aijun Yang & Yunxian Li & Niansheng Tang & Jinguan Lin, 2015. "Bayesian variable selection in multinomial probit model for classifying high-dimensional data," Computational Statistics, Springer, vol. 30(2), pages 399-418, June.
    12. Yang, Aijun & Jiang, Xuejun & Liu, Pengfei & Lin, Jinguan, 2016. "Sparse Bayesian multinomial probit regression model with correlation prior for high-dimensional data classification," Statistics & Probability Letters, Elsevier, vol. 119(C), pages 241-247.
    13. Lizhen Shen & Hua Jiang & Mingfang He & Guoqing Liu, 2017. "Collaborative representation-based classification of microarray gene expression data," PLOS ONE, Public Library of Science, vol. 12(12), pages 1-14, December.
    14. Shi, Guiling & Lim, Chae Young & Maiti, Tapabrata, 2019. "Bayesian model selection for generalized linear models using non-local priors," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 285-296.
    15. Chen, Kun & Jiang, Wenxin & Tanner, Martin A., 2010. "A note on some algorithms for the Gibbs posterior," Statistics & Probability Letters, Elsevier, vol. 80(15-16), pages 1234-1241, August.
