
Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: a comprehensive analysis

Author

Listed:
  • Kamyab Karimi

    (Kharazmi University)

  • Ali Ghodratnama

    (Kharazmi University)

  • Reza Tavakkoli-Moghaddam

    (University of Tehran)

Abstract

In recent decades, breast cancer has become one of the leading causes of mortality among women. Because its causes are unknown, the disease cannot be prevented; however, early diagnosis increases patients’ chances of recovery. Machine learning (ML) can be used to improve treatment outcomes in healthcare operations while reducing cost and time. In this research, we propose two novel feature selection (FS) methods, one based on an imperialist competitive algorithm (ICA) and one on a bat algorithm (BA), each combined with ML algorithms. This study aims to enhance the efficiency of diagnostic models and to present a comprehensive analysis that helps clinical physicians make more precise and reliable decisions. K-nearest neighbors (KNN), support vector machine (SVM), decision tree (DT), Naive Bayes, AdaBoost (AB), linear discriminant analysis (LDA), random forest (RF), logistic regression (LR), and artificial neural network (ANN) are among the methods employed. Sensitivity, accuracy, precision, mean absolute error, F-score, root mean square error, Kappa, and relative absolute error were used to measure the performance of the methods. This paper applies a distinctive integration of evaluation measures and ML algorithms using wrapper feature selection based on ICA (WFSIC) and on BA (WFSB) separately. We compare the two proposed approaches in terms of classifier performance, and we compare our best diagnostic model with previous works reported in the literature. Experiments were performed on the Wisconsin diagnostic breast cancer (WDBC) dataset. Results reveal that the proposed framework using the BA, with an accuracy of 99.12%, surpasses the framework using the ICA and most previous works. Additionally, the RF classifier combined with BA-based FS emerges as the best model and outperforms the others on the evaluation criteria.
Besides, the results illustrate the role of our techniques in reducing the dataset dimensions by up to 90% and raising the performance of diagnostic models to over 99%. Moreover, the results show that, within the optimum datasets obtained by the proposed FS approaches, some features are selected by most ML models and are therefore more critical: the standard error of area, concavity, smoothness, and perimeter; the worst values of texture, compactness, radius, symmetry, smoothness, and concavity; and the mean of concave points, fractal dimension, compactness, and concavity. These features can remarkably affect the efficiency of breast cancer prediction. This study illustrates the role of our approaches in enhancing treatment outcomes in healthcare operations.
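
The eight evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `evaluate`, the label encoding (1 = malignant, 0 = benign), and the choice to compute the error measures on hard 0/1 predictions rather than on predicted probabilities are all assumptions made here for illustration, and the labels in the usage example are invented toy values, not WDBC results.

```python
import math


def evaluate(y_true, y_pred):
    """Compute the eight measures for binary labels (assumed 1 = malignant, 0 = benign)."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, n)
    sensitivity = ratio(tp, tp + fn)          # also called recall
    precision = ratio(tp, tp + fp)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)

    # Error measures on hard labels (an assumption; probabilities could be used instead).
    abs_err = [abs(t - p) for t, p in pairs]
    mae = ratio(sum(abs_err), n)
    rmse = math.sqrt(ratio(sum(e * e for e in abs_err), n))

    # Relative absolute error: total error relative to always predicting the mean label.
    mean_true = ratio(sum(y_true), n)
    rae = ratio(sum(abs_err), sum(abs(t - mean_true) for t in y_true))

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = ratio(accuracy - p_chance, 1 - p_chance)

    return {
        "accuracy": accuracy, "sensitivity": sensitivity, "precision": precision,
        "f_score": f_score, "mae": mae, "rmse": rmse, "rae": rae, "kappa": kappa,
    }


if __name__ == "__main__":
    # Toy labels for illustration only.
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    for name, value in evaluate(truth, preds).items():
        print(f"{name}: {value:.4f}")
```

In a wrapper FS setting such as the one the abstract describes, a function of this kind would typically be called once per candidate feature subset, with one of the measures (for example accuracy) serving as the fitness value that the search algorithm tries to improve.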

Suggested Citation

  • Kamyab Karimi & Ali Ghodratnama & Reza Tavakkoli-Moghaddam, 2023. "Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: a comprehensive analysis," Annals of Operations Research, Springer, vol. 328(1), pages 665-700, September.
  • Handle: RePEc:spr:annopr:v:328:y:2023:i:1:d:10.1007_s10479-022-04933-8
    DOI: 10.1007/s10479-022-04933-8
    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-022-04933-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.
