IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0308531.html

Introducing effective genes in lymph node metastasis of breast cancer patients using SHAP values based on the mRNA expression data

Author

Listed:
  • Sepideh Zununi Vahed
  • Seyed Mahdi Hosseiniyan Khatibi
  • Yalda Rahbar Saadat
  • Manijeh Emdadi
  • Bahareh Khodaei
  • Mohammad Matin Alishani
  • Farnaz Boostani
  • Solmaz Maleki Dizaj
  • Saeed Pirmoradi

Abstract

Objective: Breast cancer, a global health concern that predominantly affects women, poses a significant threat when it is not identified early. Although survival rates for breast cancer patients are generally favorable, the emergence of regional metastases markedly reduces survival prospects. Detecting metastases and understanding their molecular underpinnings are crucial for tailoring effective treatments and improving patient survival.

Methods: Several artificial intelligence methods were employed in this study. First, the data were organized and underwent hold-out cross-validation, data cleaning, and normalization. Feature selection was then performed using ANOVA and binary Particle Swarm Optimization (PSO). Next, the discriminative power of the selected features was evaluated using machine learning classification algorithms. Finally, the SHAP algorithm was applied to the selected features to identify those most significant for decoding the dominant molecular mechanisms in lymph node metastases.

Results: The analysis of the mRNA expression data followed five main steps: reading, preprocessing, feature selection, classification, and the SHAP algorithm. Using the candidate mRNAs, the RF classifier differentiated between the negative and positive categories with an accuracy of 61% and an AUC of 0.6. The SHAP analysis revealed relationships between the selected mRNAs and positive/negative lymph node status. Based on their SHAP values, GDF5, BAHCC1, LCN2, FGF14-AS2, and IDH2 are the five most impactful mRNAs.

Conclusion: The prominent mRNAs identified, including GDF5, BAHCC1, LCN2, FGF14-AS2, and IDH2, are implicated in lymph node metastasis. This study offers insight into key candidate genes that could significantly affect early detection and tailored therapeutic strategies for lymph node metastasis in patients with breast cancer.
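
To make the final step of the abstract more concrete, the following is a minimal sketch of what a SHAP value measures, assuming nothing about the authors' actual code. It computes exact Shapley values for a single sample by averaging each feature's marginal contribution over all feature coalitions, with absent features set to a baseline value. The function names (shapley_values, toy_model), the three-feature scoring function, and the sample values are all hypothetical and serve only as illustration. In practice the SHAP library approximates these quantities efficiently for models such as random forests.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline sample.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` take their real values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical score (not from the paper): two linear terms plus one interaction.
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]


# Hypothetical expression values for three illustrative features.
x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]

phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["feature_0", "feature_1", "feature_2"], phi):
    print(f"{name}: {contribution:+.3f}")

# Efficiency property: contributions sum to f(x) - f(baseline).
total = sum(phi)
```

Ranking features by the magnitude of such contributions, averaged over many samples, is the usual way a "most impactful" list like the one in the Results is produced. The sign of each contribution shows whether the feature pushes the prediction toward or away from the positive class.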

Suggested Citation

  • Sepideh Zununi Vahed & Seyed Mahdi Hosseiniyan Khatibi & Yalda Rahbar Saadat & Manijeh Emdadi & Bahareh Khodaei & Mohammad Matin Alishani & Farnaz Boostani & Solmaz Maleki Dizaj & Saeed Pirmoradi, 2024. "Introducing effective genes in lymph node metastasis of breast cancer patients using SHAP values based on the mRNA expression data," PLOS ONE, Public Library of Science, vol. 19(8), pages 1-19, August.
  • Handle: RePEc:plo:pone00:0308531
    DOI: 10.1371/journal.pone.0308531

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0308531
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0308531&type=printable
    Download Restriction: no



