Improved Machine Learning-Based Predictive Models for Breast Cancer Diagnosis

Author

Abdur Rasool
(University of Chinese Academy of Sciences, Beijing 101408, China
Shenzhen Key Lab for High Performance Data Mining, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
These authors contributed equally to this work.)
Chayut Bunterngchit
(University of Chinese Academy of Sciences, Beijing 101408, China
State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
These authors contributed equally to this work.)
Luo Tiejian
(University of Chinese Academy of Sciences, Beijing 101408, China)
Md. Ruhul Islam
(Department of Electrical Engineering and Computer Science, University of Stavanger, 4044 Stavanger, Norway)
Qiang Qu
(Shenzhen Key Lab for High Performance Data Mining, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China)
Qingshan Jiang
(Shenzhen Key Lab for High Performance Data Mining, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China)

Abstract

Breast cancer death rates are higher than those of any other cancer in American women. Machine learning-based predictive models promise earlier detection in breast cancer diagnosis. However, evaluating which models diagnose cancer efficiently remains challenging. In this work, we propose data exploratory techniques (DET) and develop four predictive models to improve breast cancer diagnostic accuracy. Before model training, four layers of DET (feature distribution, correlation, elimination, and hyperparameter optimization) were examined in depth to identify robust features for classification into malignant and benign classes. The proposed techniques and classifiers were implemented on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset and the Breast Cancer Coimbra Dataset (BCCD). Standard performance metrics, including confusion matrices and K-fold cross-validation, were applied to assess each classifier’s efficiency and training time. The models’ diagnostic capability improved with our DET: on the WDBC dataset, polynomial SVM reached 99.3% accuracy, LR 98.06%, KNN 97.35%, and EC 97.61%. We also compared our results with previous studies in terms of accuracy. The implementation procedure and findings can guide physicians in adopting an effective model for a practical understanding and prognosis of breast cancer tumors.
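The evaluation protocol named in the abstract (K-fold cross-validation summarized by a confusion matrix and accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic two-feature points rather than WDBC or BCCD, the classifier is a plain k-nearest-neighbour vote, and every name here (`make_toy_data`, `knn_predict`, `k_fold_confusion`, `accuracy`) is hypothetical and chosen only for illustration.

```python
import math
import random


def make_toy_data(n_per_class=40, seed=0):
    """Synthetic samples (assumption: NOT the WDBC data); label 1 stands in
    for 'malignant', label 0 for 'benign'."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k=3):
    """Majority vote among the k training samples nearest to `point`."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], point))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def k_fold_confusion(data, n_folds=5, k=3):
    """Hold out each fold once and pool all predictions into one
    confusion matrix (true/false positives and negatives)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for fold in range(n_folds):
        test = data[fold::n_folds]  # samples whose index % n_folds == fold
        train = [s for i, s in enumerate(data) if i % n_folds != fold]
        for point, actual in test:
            predicted = knn_predict(train, point, k)
            if predicted == 1:
                counts["tp" if actual == 1 else "fp"] += 1
            else:
                counts["tn" if actual == 0 else "fn"] += 1
    return counts


def accuracy(counts):
    """Accuracy = (TP + TN) / all predictions."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


data = make_toy_data()
counts = k_fold_confusion(data)
print(counts, round(accuracy(counts), 4))
```

Because every sample is held out exactly once, the four counts sum to the dataset size, so the pooled accuracy is directly comparable across classifiers evaluated on the same folds. The figures reported in the abstract come from the authors' own pipeline and datasets, not from a sketch like this one.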

Suggested Citation

Abdur Rasool & Chayut Bunterngchit & Luo Tiejian & Md. Ruhul Islam & Qiang Qu & Qingshan Jiang, 2022. "Improved Machine Learning-Based Predictive Models for Breast Cancer Diagnosis," IJERPH, MDPI, vol. 19(6), pages 1-19, March.

Handle: RePEc:gam:jijerp:v:19:y:2022:i:6:p:3211-:d:767137

References listed on IDEAS

Diva Cristina Morett Romano Leão & Eliane Ramos Pereira & María Nieves Pérez-Marfil & Rose Mary Costa Rosa Andrade Silva & Angelo Braga Mendonça & Renata Carla Nencetti Pereira Rocha & María Paz Garcí, 2021. "The Importance of Spirituality for Women Facing Breast Cancer Diagnosis: A Qualitative Study," IJERPH, MDPI, vol. 18(12), pages 1-11, June.
Wang, Haifeng & Zheng, Bichen & Yoon, Sang Won & Ko, Hoo Sang, 2018. "A support vector machine-based ensemble algorithm for breast cancer diagnosis," European Journal of Operational Research, Elsevier, vol. 267(2), pages 687-699.
Kwang Ho Park & Erdenebileg Batbaatar & Yongjun Piao & Nipon Theera-Umpon & Keun Ho Ryu, 2021. "Deep Learning Feature Extraction Approach for Hematopoietic Cancer Subtype Classification," IJERPH, MDPI, vol. 18(4), pages 1-24, February.
Eun Young Park & Myungsun Yi & Hye Sook Kim & Haejin Kim, 2021. "A Decision Tree Model for Breast Reconstruction of Women with Breast Cancer: A Mixed Method Approach," IJERPH, MDPI, vol. 18(7), pages 1-13, March.
Giulia Bicchierai & Federica Di Naro & Diego De Benedetto & Diletta Cozzi & Silvia Pradella & Vittorio Miele & Jacopo Nori, 2021. "A Review of Breast Imaging for Timely Diagnosis of Disease," IJERPH, MDPI, vol. 18(11), pages 1-16, May.

Citations

Cited by:

Tim Hulsen, 2022. "Data Science in Healthcare: COVID-19 and Beyond," IJERPH, MDPI, vol. 19(6), pages 1-4, March.
Ebtisam AlJalaud & Manar Hosny, 2024. "Enhancing Explainable Artificial Intelligence: Using Adaptive Feature Weight Genetic Explanation (AFWGE) with Pearson Correlation to Identify Crucial Feature Groups," Mathematics, MDPI, vol. 12(23), pages 1-48, November.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Ainun Hasanah & Jing Wu, 2025. "Bibliometric analysis and global research trends of climate change and cities studies for 30 years (1990–2021)," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 27(3), pages 5573-5617, March.
Jilong Zhang & Yuan Diao, 2024. "Hierarchical Learning-Enhanced Chaotic Crayfish Optimization Algorithm: Improving Extreme Learning Machine Diagnostics in Breast Cancer," Mathematics, MDPI, vol. 12(17), pages 1-26, August.
Joanna Błajda & Edyta Barnaś & Anna Kucab, 2022. "Application of Personalized Education in the Mobile Medical App for Breast Self-Examination," IJERPH, MDPI, vol. 19(8), pages 1-21, April.
Astorino, Annabella & Avolio, Matteo & Fuduli, Antonio, 2022. "A maximum-margin multisphere approach for binary Multiple Instance Learning," European Journal of Operational Research, Elsevier, vol. 299(2), pages 642-652.
Meshwa Rameshbhai Savalia & Jaiprakash Vinodkumar Verma, 2023. "Classifying Malignant and Benign Tumors of Breast Cancer: A Comparative Investigation Using Machine Learning Techniques," International Journal of Reliable and Quality E-Healthcare (IJRQEH), IGI Global Scientific Publishing, vol. 12(1), pages 1-19, January.
Baldomero-Naranjo, Marta & Martínez-Merino, Luisa I. & Rodríguez-Chía, Antonio M., 2020. "Tightening big Ms in integer programming formulations for support vector machines with ramp loss," European Journal of Operational Research, Elsevier, vol. 286(1), pages 84-100.
Onur Demiray & Evrim D. Gunes & Ercan Kulak & Emrah Dogan & Seyma Gorcin Karaketir & Serap Cifcili & Mehmet Akman & Sibel Sakarya, 2023. "Classification of patients with chronic disease by activation level using machine learning methods," Health Care Management Science, Springer, vol. 26(4), pages 626-650, December.
Blanquero, R. & Carrizosa, E. & Jiménez-Cordero, A. & Martín-Barragán, B., 2019. "Functional-bandwidth kernel for Support Vector Machine with Functional Data: An alternating optimization algorithm," European Journal of Operational Research, Elsevier, vol. 275(1), pages 195-207.
P. K. Viswanathan & Sandeep Srivathsan & Wayne L. Winston, 2022. "Multiclass Discriminant Analysis using Ensemble Technique: Case Illustration from the Banking Industry," Journal of Emerging Market Finance, Institute for Financial Management and Research, vol. 21(1), pages 92-115, March.
Golmohammadi, Davood & Zhao, Lingyu & Dreyfus, David, 2023. "Using machine learning techniques to reduce uncertainty for outpatient appointment scheduling practices in outpatient clinics," Omega, Elsevier, vol. 120(C).
Liang, Xijun & Zhang, Zhipeng & Song, Yunquan & Jian, Ling, 2022. "Kernel-based online regression with canal loss," European Journal of Operational Research, Elsevier, vol. 297(1), pages 268-279.
Kamyab Karimi & Ali Ghodratnama & Reza Tavakkoli-Moghaddam, 2023. "Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: a comprehensive analysis," Annals of Operations Research, Springer, vol. 328(1), pages 665-700, September.
Sarah N. Alyami & Sunday O. Olatunji, 2020. "Application of Support Vector Machine for Arabic Sentiment Classification Using Twitter-Based Dataset," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 19(01), pages 1-13, April.
Che Xu & Wenjun Chang & Weiyong Liu, 2023. "Data-driven decision model based on local two-stage weighted ensemble learning," Annals of Operations Research, Springer, vol. 325(2), pages 995-1028, June.
Maggioni, Francesca & Spinelli, Andrea, 2025. "A novel robust optimization model for nonlinear Support Vector Machine," European Journal of Operational Research, Elsevier, vol. 322(1), pages 237-253.
Akampurira Paul & Mutebi Joe & Mugisha Brian & Muhaise Hussein & Kyomuhangi Rosette, 2024. "Exploring Dimensionality Reduction Techniques for Improved Breast Cancer Diagnosis," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(5), pages 808-824, May.
Li, Yanying & Che, Jinxing & Yang, Youlong, 2018. "Subsampled support vector regression ensemble for short term electric load forecasting," Energy, Elsevier, vol. 164(C), pages 160-170.
Qifa Xu & Zezhou Wang & Cuixia Jiang & Yezheng Liu, 2023. "Deep learning on mixed frequency data," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(8), pages 2099-2120, December.
Chen, Weiyi & Zhang, Limao, 2022. "An automated machine learning approach for earthquake casualty rate and economic loss prediction," Reliability Engineering and System Safety, Elsevier, vol. 225(C).
