
Binary dwarf mongoose optimizer for solving high-dimensional feature selection problems

Author

Listed:
  • Olatunji A Akinola
  • Jeffrey O Agushaka
  • Absalom E Ezugwu

Abstract

Selecting an appropriate feature subset is a vital task in machine learning. Its main goal is to remove noisy, irrelevant, and redundant features that could degrade the learning model's accuracy, thereby improving classification performance without information loss. Advanced optimization methods have therefore been employed to locate the optimal subset of features. This paper presents a binary version of the dwarf mongoose optimization algorithm, called BDMO, to solve the high-dimensional feature selection problem. The approach was validated on 18 high-dimensional datasets from the Arizona State University feature selection repository, and its efficacy was compared with that of other well-known feature selection techniques in the literature. The results show that BDMO outperforms the other methods, producing the lowest average fitness value in 14 of the 18 datasets (77.78%). BDMO also demonstrates stability, returning the lowest standard deviation (SD) in 13 of the 18 datasets (72.22%). Furthermore, it achieved higher validation accuracy than the other methods in 15 of the 18 datasets (83.33%). The proposed approach also yielded the highest attainable validation accuracy on the COIL20 and Leukemia datasets, further evidence of BDMO's advantage.
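
To make the idea of a binary optimizer for feature selection more concrete, here is a minimal Python sketch of two ingredients such methods commonly rely on: a transfer function that turns a continuous search position into a 0/1 feature mask, and a fitness value that trades classification error against subset size (lower is better). The abstract does not state which transfer function or fitness formula BDMO uses, so the sigmoid transfer, the weight alpha = 0.99, and the names binarize, fitness, and toy_error are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binarize(position, rng):
    """Map a continuous position vector to a 0/1 feature mask.

    Assumption: an S-shaped (sigmoid) transfer function, a common choice
    in binary metaheuristics. Feature i is kept with probability
    sigmoid(position[i]).
    """
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


def fitness(mask, error_rate, alpha=0.99):
    """Weighted sum of classification error and selected-feature ratio.

    Assumption: alpha and this formula are illustrative, not taken from
    the paper. An empty subset gets the worst score, 1.0.
    """
    selected = sum(mask)
    if selected == 0:
        return 1.0
    return alpha * error_rate + (1.0 - alpha) * selected / len(mask)


def toy_error(mask, informative=(0, 2)):
    """Hypothetical stand-in for a classifier's validation error:
    the fraction of informative features missing from the mask."""
    missing = sum(1 for i in informative if mask[i] == 0)
    return missing / len(informative)


if __name__ == "__main__":
    rng = random.Random(42)
    position = [2.0, -2.0, 1.5, -1.0]  # one candidate solution
    mask = binarize(position, rng)
    print(mask, fitness(mask, toy_error(mask)))
```

In a real wrapper setup, toy_error would be replaced by the validation error of a classifier trained on the selected columns, and the optimizer would update the continuous positions between calls to binarize.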

Suggested Citation

  • Olatunji A Akinola & Jeffrey O Agushaka & Absalom E Ezugwu, 2022. "Binary dwarf mongoose optimizer for solving high-dimensional feature selection problems," PLOS ONE, Public Library of Science, vol. 17(10), pages 1-26, October.
  • Handle: RePEc:plo:pone00:0274850
    DOI: 10.1371/journal.pone.0274850

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0274850
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0274850&type=printable
    Download Restriction: no


    Citations

    Cited by:

    1. Olaide N Oyelade & Jeffrey O Agushaka & Absalom E Ezugwu, 2023. "Evolutionary binary feature selection using adaptive ebola optimization search algorithm for high-dimensional datasets," PLOS ONE, Public Library of Science, vol. 18(3), pages 1-36, March.

