Printed from https://ideas.repec.org/a/spr/jcomop/v49y2025i2d10.1007_s10878-025-01259-6.html

Embedded-filter ACO using clustering based mutual information for feature selection

Author

Listed:
  • S. Kumar Reddy Mallidi

    (Jawaharlal Nehru Technological University Kakinada
    Sri Vasavi Engineering College)

  • Rajeswara Rao Ramisetty

    (Jawaharlal Nehru Technological University Gurajada)

Abstract

The performance of machine learning algorithms depends heavily on the quality of the underlying dataset, which often mixes essential and redundant features. Feature selection, which identifies and discards the redundant features, plays a pivotal role in reducing computational and storage overheads. Current methodologies for this task are primarily filter-based or wrapper-based. Ant Colony Optimization, a popular bio-inspired meta-heuristic, has been used extensively for feature selection, typically with mutual information as the principal heuristic measure. Traditional mutual information, however, is primarily suited to categorical features. To address this limitation, this study introduces an Embedded-Filter Ant Colony Optimization feature selection strategy that incorporates Clustering-Based Mutual Information, which offers enhanced support for classification tasks involving continuous features. To validate the proposed approach, various datasets were used, and a diverse range of machine learning algorithms was employed to evaluate the derived feature subsets. The proposed method was compared with Grey Wolf Optimization and Cuckoo Search Optimization-based feature selection approaches, and comprehensively evaluated against established Ant Colony Optimization wrapper techniques. Experimental results indicate that the proposed Embedded-Filter Ant Colony Optimization consistently selects the minimal yet most relevant feature set while largely maintaining the efficacy of machine learning algorithms.
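
The abstract does not spell out how Clustering-Based Mutual Information is computed. One plausible reading, offered here only as a sketch and not as the authors' method, is to cluster each continuous feature into a small number of groups and then apply the ordinary discrete mutual information formula to the cluster labels and the class labels. The function names (kmeans_1d, mutual_information, clustering_mi), the choice of one-dimensional k-means, and the default of k=2 clusters are all assumptions made for illustration.

```python
import math
from collections import Counter


def kmeans_1d(values, k, iters=50):
    """Cluster scalar values into k groups with plain Lloyd iterations.

    Assumption: 1-D k-means stands in for whatever clustering the paper uses.
    """
    lo, hi = min(values), max(values)
    # Deterministic start: centres spread evenly over the observed range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest centre.
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        new_centres = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old centre if a cluster ends up empty.
            new_centres.append(sum(members) / len(members) if members else centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def clustering_mi(feature, target, k=2):
    """Hypothetical helper: MI between a clustered continuous feature and labels."""
    return mutual_information(kmeans_1d(feature, k), target)


# Toy data: the feature has a low group and a high group.
feature = [0.1, 0.2, 0.9, 1.0, 0.15, 0.25, 0.95, 0.85]
relevant = [0, 0, 1, 1, 0, 0, 1, 1]    # class follows the low/high split
irrelevant = [0, 1, 0, 1, 0, 1, 0, 1]  # class is independent of the split

print(clustering_mi(feature, relevant))    # 1.0
print(clustering_mi(feature, irrelevant))  # 0.0
```

On this toy data the clustered feature carries one full bit about the relevant labels and no information about the irrelevant ones. In an ACO-based selector, a score of this kind could serve as the heuristic desirability of a feature, but how the paper combines it with pheromone updates is not stated in the abstract.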

Suggested Citation

  • S. Kumar Reddy Mallidi & Rajeswara Rao Ramisetty, 2025. "Embedded-filter ACO using clustering based mutual information for feature selection," Journal of Combinatorial Optimization, Springer, vol. 49(2), pages 1-30, March.
  • Handle: RePEc:spr:jcomop:v:49:y:2025:i:2:d:10.1007_s10878-025-01259-6
    DOI: 10.1007/s10878-025-01259-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10878-025-01259-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. repec:iim:iimawp:14638 is not listed on IDEAS
    2. María Isabel Arango & Edier Aristizábal & Federico Gómez, 2021. "Morphometrical analysis of torrential flows-prone catchments in tropical and mountainous terrain of the Colombian Andes by machine learning techniques," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 105(1), pages 983-1012, January.
    3. Xiaobo Yang & Zhilong Mi & Qingcai He & Binghui Guo & Zhiming Zheng, 2023. "Identification of Vital Genes for NSCLC Integrating Mutual Information and Synergy," Mathematics, MDPI, vol. 11(6), pages 1-15, March.
    4. Alec S. Dyer & MacKenzie Mark-Moser & Rodrigo Duran & Jennifer R. Bauer, 2024. "Offshore application of landslide susceptibility mapping using gradient-boosted decision trees: a Gulf of Mexico case study," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 120(7), pages 6223-6244, May.
    5. Ahmadi, Arman & Kazemi, Mohammad Hossein & Daccache, Andre & Snyder, Richard L., 2024. "SolarET: A generalizable machine learning approach to estimate reference evapotranspiration from solar radiation," Agricultural Water Management, Elsevier, vol. 295(C).
    6. Trizoglou, Pavlos & Liu, Xiaolei & Lin, Zi, 2021. "Fault detection by an ensemble framework of Extreme Gradient Boosting (XGBoost) in the operation of offshore wind turbines," Renewable Energy, Elsevier, vol. 179(C), pages 945-962.
    7. Lunacek, Monte & Williams, Lindy & Severino, Joseph & Ficenec, Karen & Ugirumurera, Juliette & Eash, Matthew & Ge, Yanbo & Phillips, Caleb, 2021. "A data-driven operational model for traffic at the Dallas Fort Worth International Airport," Journal of Air Transport Management, Elsevier, vol. 94(C).
    8. Philip Cammin & Jingjing Yu & Stefan Voß, 2023. "Tiered prediction models for port vessel emissions inventories," Flexible Services and Manufacturing Journal, Springer, vol. 35(1), pages 142-169, March.
    9. Ao Kong & Robert Azencott & Hongliang Zhu & Xindan Li, 2020. "Pattern recognition in micro-trading behaviors before stock price jumps: A framework based on multivariate time series analysis," Papers 2011.04939, arXiv.org, revised Feb 2021.
    10. Wei, Yupeng & Wu, Dazhong, 2023. "Prediction of state of health and remaining useful life of lithium-ion battery using graph convolutional network with dual attention mechanisms," Reliability Engineering and System Safety, Elsevier, vol. 230(C).
    11. Wang, Weicheng & Chen, Jinglong & Zhang, Tianci & Liu, Zijun & Wang, Jun & Zhang, Xinwei & He, Shuilong, 2023. "An asymmetrical graph Siamese network for one-class anomaly detection of engine equipment with multi-source fusion," Reliability Engineering and System Safety, Elsevier, vol. 235(C).
    12. Zhun Cheng & Zhixiong Lu, 2022. "Regression-Based Correction and I-PSO-Based Optimization of HMCVT’s Speed Regulating Characteristics for Agricultural Machinery," Agriculture, MDPI, vol. 12(5), pages 1-18, April.
    13. Xin Dang & Dao Nguyen & Yixin Chen & Junying Zhang, 2021. "A new Gini correlation between quantitative and qualitative variables," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(4), pages 1314-1343, December.
    14. Noemí DeCastro-García & Ángel Luis Muñoz Castañeda & David Escudero García & Miguel V. Carriegos, 2019. "Effect of the Sampling of a Dataset in the Hyperparameter Optimization Phase over the Efficiency of a Machine Learning Algorithm," Complexity, Hindawi, vol. 2019, pages 1-16, February.
    15. Banerjee, Ameet Kumar & Dionisio, Andreia & Pradhan, H.K. & Mahapatra, Biplab, 2021. "Hunting the quicksilver: Using textual news and causality analysis to predict market volatility," International Review of Financial Analysis, Elsevier, vol. 77(C).
    16. Zhun Cheng & Huadong Zhou & Zhixiong Lu, 2022. "A Novel 10-Parameter Motor Efficiency Model Based on I-SA and Its Comparative Application of Energy Utilization Efficiency in Different Driving Modes for Electric Tractor," Agriculture, MDPI, vol. 12(3), pages 1-20, March.
    17. Md Fahim Anjum & Clay Smyth & Rafael Zuzuárregui & Derk Jan Dijk & Philip A. Starr & Timothy Denison & Simon Little, 2024. "Multi-night cortico-basal recordings reveal mechanisms of NREM slow-wave suppression and spontaneous awakenings in Parkinson’s disease," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    18. Cheng, Zhun, 2023. "High nonlinearity of BEV's stepped automatic transmission design objectives and its optimal solution by a novel ISA-RSA," Energy, Elsevier, vol. 282(C).
    19. Hasan T Abbas & Lejla Alic & Madhav Erraguntla & Jim X Ji & Muhammad Abdul-Ghani & Qammer H Abbasi & Marwa K Qaraqe, 2019. "Predicting long-term type 2 diabetes with support vector machine using oral glucose tolerance test," PLOS ONE, Public Library of Science, vol. 14(12), pages 1-11, December.
    20. Ao Kong & Robert Azencott & Hongliang Zhu & Xindan Li, 2024. "Pattern Recognition in Microtrading Behaviors Preceding Stock Price Jumps: A Study Based on Mutual Information for Multivariate Time Series," Computational Economics, Springer;Society for Computational Economics, vol. 63(4), pages 1401-1429, April.
    21. Yu-Wen Chen & Yi-Chun Li & Chien-Yu Huang & Chia-Jung Lin & Chia-Jui Tien & Wen-Shiang Chen & Chia-Ling Chen & Keh-Chung Lin, 2023. "Predicting Arm Nonuse in Individuals with Good Arm Motor Function after Stroke Rehabilitation: A Machine Learning Study," IJERPH, MDPI, vol. 20(5), pages 1-12, February.
