Author
Listed:
- Sheema Gul
- Dost Muhammad Khan
- Saeed Aldahmani
- Zardad Khan
Abstract
High-dimensional gene expression data pose significant challenges for binary classification, particularly for feature selection. Conventional methods, for example Proportional Overlap Score, Wilcoxon Rank-Sum Test, Weighted Signal to Noise Ratio, ensemble Minimum Redundancy and Maximum Relevance, Fisher Score, and Robust Weighted Score for unbalanced data, are affected by key challenges such as class imbalance and redundancy. Mitigating these issues requires feature selection methods customized for class imbalance. This study proposes a more robust solution, Margin Weighted Robust Discriminant Score (MW-RDS), for feature selection in high-dimensional imbalanced problems. MW-RDS integrates a minority amplification factor to ensure that minority class observations influence the feature ranking process. The amplification factor, along with class-specific stability weights obtained from a minority-focused robust discriminant score, is used to achieve maximum differential capability of genes/features. The score is weighted by margin weights extracted from support vectors to enhance the discriminative power of genes/features, thereby highlighting their potential for class separation. Finally, the top-ranked genes/features are constrained using ℓ1-regularization to discard redundant genes while identifying the most significant ones. The performance of the proposed method is tested on 9 openly accessible gene expression datasets, using Random Forest, Support Vector Machines, and Weighted k Nearest Neighbors classifiers, in terms of accuracy, sensitivity, specificity, F1-score, and precision. The results reveal that the proposed method outperforms the existing methods in most cases. Boxplots and stability plots are also generated to gain a deeper understanding of the results. To further assess the efficacy of the proposed method, the paper also presents a detailed simulation study.
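The five performance metrics named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `binary_metrics`, the choice of the minority class as the positive label, and the convention of returning 0.0 for an undefined ratio are all assumptions made here for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity, precision and F1-score.

    Assumption: `positive` marks the minority class; this is an
    illustrative choice, not something stated in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Assumed convention: an undefined ratio (zero denominator) is 0.0.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall on the positive class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical imbalanced example: 10 minority and 90 majority samples,
# with a classifier that always predicts the majority class.
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100
print(binary_metrics(y_true, y_pred))
```

In this made-up example the accuracy is 0.9 although no minority sample is detected (sensitivity 0.0), which suggests why accuracy alone is a weak criterion for imbalanced data and why several metrics are reported together.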
Suggested Citation
Sheema Gul & Dost Muhammad Khan & Saeed Aldahmani & Zardad Khan, 2025.
"Margin weighted robust discriminant score for feature selection in imbalanced gene expression classification,"
PLOS ONE, Public Library of Science, vol. 20(6), pages 1-28, June.
Handle:
RePEc:plo:pone00:0325147
DOI: 10.1371/journal.pone.0325147