
Exploring Dimensionality Reduction Techniques for Improved Breast Cancer Diagnosis

Author

Listed:
  • Akampurira Paul

    (Kampala International University, Uganda)

  • Mutebi Joe

    (Kampala International University, Uganda)

  • Mugisha Brian

    (Kampala International University, Uganda)

  • Muhaise Hussein

    (Kampala International University, Uganda)

  • Kyomuhangi Rosette

    (Kampala International University, Uganda)

Abstract

Breast cancer diagnosis is a crucial area of medical research in which, beyond precise identification, the inherent complexity of high-dimensional data poses a challenge. This study investigates dimensionality reduction techniques with the main goal of improving the accuracy and interpretability of breast cancer diagnosis. It examines how preprocessing methods affect a high-dimensional dataset in order to find significant patterns for useful diagnostic models. The dataset contains 569 observations and 30 attributes, and examination reveals a class imbalance (63% benign, 37% malignant). To address multicollinearity, Pearson correlation coefficients were used to detect and eliminate highly correlated features. The data were then rescaled with min-max normalization so that all features are weighted consistently, and Principal Component Analysis (PCA) was applied for further dimensionality reduction. Scree plots and biplots visualize the data and show how well suited the early principal components are for separating benign from malignant instances. The findings show a 24% decrease in data dimensionality, which supports the effectiveness of the procedure. This work highlights the critical role that dimensionality reduction plays in making breast cancer diagnostic models more precise, effective, and understandable, and it calls for further investigation of the specific findings.
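The preprocessing chain described in the abstract (Pearson correlation filtering, min-max normalization, then PCA) could be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function names, the 0.9 correlation threshold, the greedy "keep the first of each correlated group" rule, and the synthetic stand-in data are all assumptions, since the abstract does not state the threshold or the implementation used.

```python
import numpy as np

def drop_correlated(X, threshold=0.9):
    """Return indices of columns kept after dropping highly correlated ones.

    Greedy rule (an assumption): a column is kept only if its absolute
    Pearson correlation with every already-kept column is below `threshold`.
    """
    corr = np.corrcoef(X, rowvar=False)  # feature-by-feature Pearson matrix
    keep = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) < threshold for k in keep):
            keep.append(j)
    return keep

def min_max(X):
    """Rescale each column to the [0, 1] range."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span

def pca(X, n_components):
    """PCA via SVD of the centered data.

    Returns the projected scores and the explained-variance ratio of every
    component (the values a scree plot would display).
    """
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    scores = Xc @ Vt[:n_components].T
    return scores, explained

# Synthetic stand-in data (NOT the study's dataset): 10 independent features
# plus 5 near-duplicates of the first five, to mimic multicollinearity.
rng = np.random.default_rng(0)
base = rng.normal(size=(569, 10))
near_copies = base[:, :5] * 3.0 + rng.normal(scale=0.05, size=(569, 5))
X = np.hstack([base, near_copies])

keep = drop_correlated(X, threshold=0.9)
X_scaled = min_max(X[:, keep])
scores, explained = pca(X_scaled, n_components=2)

print("features kept:", len(keep), "of", X.shape[1])
print("explained variance ratio of first two components:", explained[:2])
```

Plotting `explained` against the component index would give a scree plot, and a scatter of the two columns of `scores` colored by class label would be the starting point for a biplot.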

Suggested Citation

  • Akampurira Paul & Mutebi Joe & Mugisha Brian & Muhaise Hussein & Kyomuhangi Rosette, 2024. "Exploring Dimensionality Reduction Techniques for Improved Breast Cancer Diagnosis," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(5), pages 808-824, May.
  • Handle: RePEc:bjc:journl:v:11:y:2024:i:5:p:808-824

    Download full text from publisher

    File URL: https://www.rsisinternational.org/journals/ijrsi/digital-library/volume-11-issue-5/808-824.pdf
    Download Restriction: no

    File URL: https://rsisinternational.org/journals/ijrsi/articles/exploring-dimensionality-reduction-techniques-for-improved-breast-cancer-diagnosis/
    Download Restriction: no

    References listed on IDEAS

    1. Saba Bashir & Usman Qamar & Farhan Khan, 2015. "Heterogeneous classifiers fusion for dynamic breast cancer diagnosis using weighted vote based ensemble," Quality & Quantity: International Journal of Methodology, Springer, vol. 49(5), pages 2061-2076, September.
    2. Wang, Haifeng & Zheng, Bichen & Yoon, Sang Won & Ko, Hoo Sang, 2018. "A support vector machine-based ensemble algorithm for breast cancer diagnosis," European Journal of Operational Research, Elsevier, vol. 267(2), pages 687-699.
    3. Jagpreet Chhatwal & Oguzhan Alagoz & Elizabeth S. Burnside, 2010. "Optimal Breast Biopsy Decision-Making Based on Mammographic Features and Demographic Factors," Operations Research, INFORMS, vol. 58(6), pages 1577-1591, December.
    4. Bingtao Zhang & Peng Cao, 2019. "Classification of high dimensional biomedical data based on feature selection using redundant removal," PLOS ONE, Public Library of Science, vol. 14(4), pages 1-19, April.
    5. Joshua T. Vogelstein & Eric W. Bridgeford & Minh Tang & Da Zheng & Christopher Douville & Randal Burns & Mauro Maggioni, 2021. "Supervised dimensionality reduction for big data," Nature Communications, Nature, vol. 12(1), pages 1-9, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Meshwa Rameshbhai Savalia & Jaiprakash Vinodkumar Verma, 2023. "Classifying Malignant and Benign Tumors of Breast Cancer: A Comparative Investigation Using Machine Learning Techniques," International Journal of Reliable and Quality E-Healthcare (IJRQEH), IGI Global, vol. 12(1), pages 1-19, January.
    2. Ainun Hasanah & Jing Wu, 2025. "Bibliometric analysis and global research trends of climate change and cities studies for 30 years (1990–2021)," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 27(3), pages 5573-5617, March.
    3. Jing Li & Ming Dong & Yijiong Ren & Kaiqi Yin, 2015. "How patient compliance impacts the recommendations for colorectal cancer screening," Journal of Combinatorial Optimization, Springer, vol. 30(4), pages 920-937, November.
    4. Elliot Lee & Mariel Lavieri & Michael Volk & Yongcai Xu, 2015. "Applying reinforcement learning techniques to detect hepatocellular carcinoma under limited screening capacity," Health Care Management Science, Springer, vol. 18(3), pages 363-375, September.
    5. Baruch Keren & Joseph Pliskin, 2011. "Optimal timing of joint replacement using mathematical programming and stochastic programming models," Health Care Management Science, Springer, vol. 14(4), pages 361-369, November.
    6. Gemma Turon & Jason Hlozek & John G. Woodland & Ankur Kumar & Kelly Chibale & Miquel Duran-Frigola, 2023. "First fully-automated AI/ML virtual screening cascade implemented at a drug discovery centre in Africa," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    7. Sultan Almotairi & Elsayed Badr & Mustafa Abdul Salam & Hagar Ahmed, 2023. "Breast Cancer Diagnosis Using a Novel Parallel Support Vector Machine with Harris Hawks Optimization," Mathematics, MDPI, vol. 11(14), pages 1-25, July.
    8. Oguzhan Alagoz & Jagpreet Chhatwal & Elizabeth S. Burnside, 2013. "Optimal Policies for Reducing Unnecessary Follow-Up Mammography Exams in Breast Cancer Diagnosis," Decision Analysis, INFORMS, vol. 10(3), pages 200-224, September.
    9. Robert Kraig Helmeczi & Can Kavaklioglu & Mucahit Cevik & Davood Pirayesh Neghab, 2023. "A multi-objective constrained partially observable Markov decision process model for breast cancer screening," Operational Research, Springer, vol. 23(2), pages 1-42, June.
    10. Malek Ebadi & Raha Akhavan-Tabatabaei, 2021. "Personalized Cotesting Policies for Cervical Cancer Screening: A POMDP Approach," Mathematics, MDPI, vol. 9(6), pages 1-20, March.
    11. Astorino, Annabella & Avolio, Matteo & Fuduli, Antonio, 2022. "A maximum-margin multisphere approach for binary Multiple Instance Learning," European Journal of Operational Research, Elsevier, vol. 299(2), pages 642-652.
    12. Liu, Qiang, 2021. "Reliability evaluation of two-stage evidence classification system considering preference and error," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    13. Onur Demiray & Evrim D. Gunes & Ercan Kulak & Emrah Dogan & Seyma Gorcin Karaketir & Serap Cifcili & Mehmet Akman & Sibel Sakarya, 2023. "Classification of patients with chronic disease by activation level using machine learning methods," Health Care Management Science, Springer, vol. 26(4), pages 626-650, December.
    14. Blanquero, R. & Carrizosa, E. & Jiménez-Cordero, A. & Martín-Barragán, B., 2019. "Functional-bandwidth kernel for Support Vector Machine with Functional Data: An alternating optimization algorithm," European Journal of Operational Research, Elsevier, vol. 275(1), pages 195-207.
    15. David D. Cho & Kurt M. Bretthauer & Jan Schoenfelder, 2023. "Patient-to-nurse ratios: Balancing quality, nurse turnover, and cost," Health Care Management Science, Springer, vol. 26(4), pages 807-826, December.
    16. Mehmet U. S. Ayvaci & Oguzhan Alagoz & Elizabeth S. Burnside, 2012. "The Effect of Budgetary Restrictions on Breast Cancer Diagnostic Decisions," Manufacturing & Service Operations Management, INFORMS, vol. 14(4), pages 600-617, October.
    17. Aboubacry Gaye & Abdou Ka Diongue & Seydou Nourou Sylla & Maryam Diarra & Amadou Diallo & Cheikh Talla & Cheikh Loucoubar, 2024. "Supervised Classification of High-Dimensional Correlated Data: Application to Genomic Data," Journal of Classification, Springer;The Classification Society, vol. 41(1), pages 158-169, March.
    18. Mehmet A. Ergun & Ali Hajjar & Oguzhan Alagoz & Murtuza Rampurwala, 2022. "Optimal breast cancer risk reduction policies tailored to personal risk level," Health Care Management Science, Springer, vol. 25(3), pages 363-388, September.
    19. Wesley J. Marrero & Mariel S. Lavieri & Jeremy B. Sussman, 2021. "Optimal cholesterol treatment plans and genetic testing strategies for cardiovascular diseases," Health Care Management Science, Springer, vol. 24(1), pages 1-25, March.
    20. Wang, Haifeng & Zheng, Bichen & Yoon, Sang Won & Ko, Hoo Sang, 2018. "A support vector machine-based ensemble algorithm for breast cancer diagnosis," European Journal of Operational Research, Elsevier, vol. 267(2), pages 687-699.
