
A support vector machine-based ensemble algorithm for breast cancer diagnosis

Author

Listed:
  • Wang, Haifeng
  • Zheng, Bichen
  • Yoon, Sang Won
  • Ko, Hoo Sang

Abstract

This research studies a support vector machine (SVM)-based ensemble learning algorithm for breast cancer diagnosis. Illness diagnosis plays a critical role in designating treatment strategies, which are highly related to patient safety. Nowadays, numerous classification models in data mining domains are adapted to breast cancer diagnosis based on patients’ historical medical records. However, the performance of each algorithm depends on various model configurations, such as input feature types and model parameters. To tackle the limitation of individual model performance, this research focuses on breast cancer diagnosis that uses an SVM-based ensemble learning algorithm to reduce the diagnosis variance and increase diagnosis accuracy. Twelve different SVMs, based on the proposed Weighted Area Under the Receiver Operating Characteristic Curve Ensemble (WAUCE) approach, are hybridized. To evaluate the performance of the proposed model, Wisconsin Breast Cancer, Wisconsin Diagnostic Breast Cancer, and the U.S. National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) program breast cancer datasets have been studied. The experimental results show that the WAUCE model achieves a higher accuracy with a significantly lower variance for breast cancer diagnosis compared to five other ensemble mechanisms and two common ensemble models, i.e., adaptive boosting and bagging classification tree. The proposed WAUCE model reduces the variance by 97.89% and increases accuracy by 33.34%, compared to the best single SVM model on the SEER dataset. In practice, the proposed methodology can be further applied to other illness diagnoses, which offers an alternative to a safer, more reliable, and more robust illness diagnosis process.
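The abstract does not spell out how WAUCE assigns its weights or how the twelve SVMs are configured, so the following is only a minimal sketch of the general idea of an AUC-weighted ensemble, not the authors' algorithm. It assumes each base model (for example, an SVM) emits a score per patient, weights each model in proportion to its validation AUC, and averages the scores. The function names, the proportional weighting rule, the 0.5 threshold, and the toy scores are all illustrative assumptions.

```python
# Illustrative AUC-weighted ensemble; an assumption-laden sketch, not the paper's WAUCE.

def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A positive/negative pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_weights(labels, model_scores):
    """Weight each base model in proportion to its validation AUC (assumed rule)."""
    aucs = [auc(labels, s) for s in model_scores]
    total = sum(aucs)
    return [a / total for a in aucs]

def ensemble_scores(weights, model_scores):
    """Weighted average of the base-model scores, one value per sample."""
    n_samples = len(model_scores[0])
    return [
        sum(w * s[i] for w, s in zip(weights, model_scores))
        for i in range(n_samples)
    ]

# Toy validation data (hypothetical): 1 = malignant, 0 = benign.
val_labels = [1, 1, 1, 0, 0, 0]
val_scores = [
    [0.9, 0.8, 0.4, 0.3, 0.2, 0.1],  # hypothetical model A: ranks every pair correctly
    [0.7, 0.4, 0.6, 0.5, 0.3, 0.2],  # hypothetical model B: one mis-ranked pair
    [0.6, 0.3, 0.5, 0.4, 0.7, 0.2],  # hypothetical model C: several mis-ranked pairs
]

weights = auc_weights(val_labels, val_scores)
combined = ensemble_scores(weights, val_scores)
# Threshold of 0.5 is an arbitrary illustrative choice.
predictions = [1 if s >= 0.5 else 0 for s in combined]
```

Models with a better validation ranking receive larger weights, so a weak base model pulls the combined score less than a strong one; this is one plausible way an ensemble can reduce the variance of any single model configuration.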

Suggested Citation

  • Wang, Haifeng & Zheng, Bichen & Yoon, Sang Won & Ko, Hoo Sang, 2018. "A support vector machine-based ensemble algorithm for breast cancer diagnosis," European Journal of Operational Research, Elsevier, vol. 267(2), pages 687-699.
  • Handle: RePEc:eee:ejores:v:267:y:2018:i:2:p:687-699
    DOI: 10.1016/j.ejor.2017.12.001

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221717310810
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations


    Cited by:

    1. Onur Demiray & Evrim D. Gunes & Ercan Kulak & Emrah Dogan & Seyma Gorcin Karaketir & Serap Cifcili & Mehmet Akman & Sibel Sakarya, 2023. "Classification of patients with chronic disease by activation level using machine learning methods," Health Care Management Science, Springer, vol. 26(4), pages 626-650, December.
    2. Blanquero, R. & Carrizosa, E. & Jiménez-Cordero, A. & Martín-Barragán, B., 2019. "Functional-bandwidth kernel for Support Vector Machine with Functional Data: An alternating optimization algorithm," European Journal of Operational Research, Elsevier, vol. 275(1), pages 195-207.
    3. Qifa Xu & Zezhou Wang & Cuixia Jiang & Yezheng Liu, 2023. "Deep learning on mixed frequency data," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(8), pages 2099-2120, December.
    4. Sarah N. Alyami & Sunday O. Olatunji, 2020. "Application of Support Vector Machine for Arabic Sentiment Classification Using Twitter-Based Dataset," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 19(01), pages 1-13, April.
    5. Astorino, Annabella & Avolio, Matteo & Fuduli, Antonio, 2022. "A maximum-margin multisphere approach for binary Multiple Instance Learning," European Journal of Operational Research, Elsevier, vol. 299(2), pages 642-652.
    6. P. K. Viswanathan & Sandeep Srivathsan & Wayne L. Winston, 2022. "Multiclass Discriminant Analysis using Ensemble Technique: Case Illustration from the Banking Industry," Journal of Emerging Market Finance, Institute for Financial Management and Research, vol. 21(1), pages 92-115, March.
    7. Abdur Rasool & Chayut Bunterngchit & Luo Tiejian & Md. Ruhul Islam & Qiang Qu & Qingshan Jiang, 2022. "Improved Machine Learning-Based Predictive Models for Breast Cancer Diagnosis," IJERPH, MDPI, vol. 19(6), pages 1-19, March.
    8. Meshwa Rameshbhai Savalia & Jaiprakash Vinodkumar Verma, 2023. "Classifying Malignant and Benign Tumors of Breast Cancer: A Comparative Investigation Using Machine Learning Techniques," International Journal of Reliable and Quality E-Healthcare (IJRQEH), IGI Global, vol. 12(1), pages 1-19, January.
    9. Baldomero-Naranjo, Marta & Martínez-Merino, Luisa I. & Rodríguez-Chía, Antonio M., 2020. "Tightening big Ms in integer programming formulations for support vector machines with ramp loss," European Journal of Operational Research, Elsevier, vol. 286(1), pages 84-100.
    10. Chen, Weiyi & Zhang, Limao, 2022. "An automated machine learning approach for earthquake casualty rate and economic loss prediction," Reliability Engineering and System Safety, Elsevier, vol. 225(C).
    11. Golmohammadi, Davood & Zhao, Lingyu & Dreyfus, David, 2023. "Using machine learning techniques to reduce uncertainty for outpatient appointment scheduling practices in outpatient clinics," Omega, Elsevier, vol. 120(C).
    12. Liang, Xijun & Zhang, Zhipeng & Song, Yunquan & Jian, Ling, 2022. "Kernel-based online regression with canal loss," European Journal of Operational Research, Elsevier, vol. 297(1), pages 268-279.
    13. Che Xu & Wenjun Chang & Weiyong Liu, 2023. "Data-driven decision model based on local two-stage weighted ensemble learning," Annals of Operations Research, Springer, vol. 325(2), pages 995-1028, June.
    14. Kamyab Karimi & Ali Ghodratnama & Reza Tavakkoli-Moghaddam, 2023. "Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: a comprehensive analysis," Annals of Operations Research, Springer, vol. 328(1), pages 665-700, September.
    15. Li, Yanying & Che, Jinxing & Yang, Youlong, 2018. "Subsampled support vector regression ensemble for short term electric load forecasting," Energy, Elsevier, vol. 164(C), pages 160-170.
    16. Joanna Błajda & Edyta Barnaś & Anna Kucab, 2022. "Application of Personalized Education in the Mobile Medical App for Breast Self-Examination," IJERPH, MDPI, vol. 19(8), pages 1-21, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Brandner, Hubertus & Lessmann, Stefan & Voß, Stefan, 2013. "A memetic approach to construct transductive discrete support vector machines," European Journal of Operational Research, Elsevier, vol. 230(3), pages 581-595.
    2. Hu, Yi-Chung, 2006. "A knowledge acquisition method for determining utilities of linguistic values for product factors," European Journal of Operational Research, Elsevier, vol. 174(2), pages 945-958, October.
    3. Derhami, Shahab & Smith, Alice E., 2017. "An integer programming approach for fuzzy rule-based classification systems," European Journal of Operational Research, Elsevier, vol. 256(3), pages 924-934.
    4. Wang, Xin & Liu, Xiaodong & Pedrycz, Witold & Zhu, Xiaolei & Hu, Guangfei, 2012. "Mining axiomatic fuzzy set association rules for classification problems," European Journal of Operational Research, Elsevier, vol. 218(1), pages 202-210.
    5. Sultan Almotairi & Elsayed Badr & Mustafa Abdul Salam & Hagar Ahmed, 2023. "Breast Cancer Diagnosis Using a Novel Parallel Support Vector Machine with Harris Hawks Optimization," Mathematics, MDPI, vol. 11(14), pages 1-25, July.
    6. Ravi, V. & Reddy, P. J. & Zimmermann, H. -J., 2000. "Pattern classification with principal component analysis and fuzzy rule bases," European Journal of Operational Research, Elsevier, vol. 126(3), pages 526-533, November.
    7. Meshwa Rameshbhai Savalia & Jaiprakash Vinodkumar Verma, 2023. "Classifying Malignant and Benign Tumors of Breast Cancer: A Comparative Investigation Using Machine Learning Techniques," International Journal of Reliable and Quality E-Healthcare (IJRQEH), IGI Global, vol. 12(1), pages 1-19, January.
    8. Yaqiong Cui & Jukka Sirén & Timo Koski & Jukka Corander, 2016. "Simultaneous Predictive Gaussian Classifiers," Journal of Classification, Springer;The Classification Society, vol. 33(1), pages 73-102, April.
    9. Liu, Qiang, 2021. "Reliability evaluation of two-stage evidence classification system considering preference and error," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    10. Yi Du & Hua Yu & Zhijun Li, 0. "Research of SVM ensembles in medical examination scheduling," Journal of Combinatorial Optimization, Springer, vol. 0, pages 1-11.
    11. B Baesens & C Mues & D Martens & J Vanthienen, 2009. "50 years of data mining and OR: upcoming trends and challenges," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(1), pages 16-23, May.
    12. Sahin, Özge & Czado, Claudia, 2022. "Vine copula mixture models and clustering for non-Gaussian data," Econometrics and Statistics, Elsevier, vol. 22(C), pages 136-158.
    13. Pedro Duarte Silva, A., 2017. "Optimization approaches to Supervised Classification," European Journal of Operational Research, Elsevier, vol. 261(2), pages 772-788.
    14. Sung, Bongjung & Lee, Jaeyong, 2023. "Covariance structure estimation with Laplace approximation," Journal of Multivariate Analysis, Elsevier, vol. 198(C).
    15. Wang, Wan-Lun, 2015. "Mixtures of common t-factor analyzers for modeling high-dimensional data with missing values," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 223-235.
    16. Jun-Ya Gotoh & Michael Jong Kim & Andrew E. B. Lim, 2017. "Calibration of Distributionally Robust Empirical Optimization Models," Papers 1711.06565, arXiv.org, revised May 2020.
    17. West, David & Mangiameli, Paul & Rampal, Rohit & West, Vivian, 2005. "Ensemble strategies for a medical diagnostic decision support system: A breast cancer diagnosis application," European Journal of Operational Research, Elsevier, vol. 162(2), pages 532-551, April.
    18. Michel H. Montoril & Woojin Chang & Brani Vidakovic, 2019. "Wavelet-Based Estimation of Generalized Discriminant Functions," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 81(2), pages 318-349, December.
    19. Giovanni Felici & Klaus Truemper, 2002. "A MINSAT Approach for Learning in Logic Domains," INFORMS Journal on Computing, INFORMS, vol. 14(1), pages 20-36, February.
    20. Vijayalakshmi S & John A & Sunder R & Senthilkumar Mohan & Sweta Bhattacharya & Rajesh Kaluri & Guang Feng & Usman Tariq, 2020. "Multi-modal prediction of breast cancer using particle swarm optimization with non-dominating sorting," International Journal of Distributed Sensor Networks, vol. 16(11), pages 15501477209, November.
