Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i20p2627-d658917.html

An Intelligent Metaheuristic Binary Pigeon Optimization-Based Feature Selection and Big Data Classification in a MapReduce Environment

Authors

  • Felwa Abukhodair

    (Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia)

  • Wafaa Alsaggaf

    (Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia)

  • Amani Tariq Jamal

    (Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia)

  • Sayed Abdel-Khalek

    (Department of Mathematics and Statistics, College of Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
    Department of Mathematics, Faculty of Science, Sohag University, Sohag 82524, Egypt)

  • Romany F. Mansour

    (Department of Mathematics, Faculty of Science, New Valley University, El-Kharga 72511, Egypt)

Abstract

Big Data techniques are highly effective for systematically extracting and analyzing massive data, and they can manage data more proficiently than conventional data handling approaches. Recently, several schemes have been developed for handling big datasets with several features. At the same time, feature selection (FS) methodologies aim to eliminate redundant, noisy, and unwanted features that degrade classifier results. Since conventional methods have failed to scale to massive data, the design of new Big Data classification models is essential. To this end, this study focuses on the design of a metaheuristic optimization-based big data classification technique in a MapReduce environment (MOBDC-MR). The MOBDC-MR technique aims to choose optimal features and effectively classify big data. It involves a binary pigeon optimization algorithm (BPOA)-based FS technique to reduce complexity and increase accuracy. Beetle antenna search (BAS) with a long short-term memory (LSTM) model is employed for big data classification. The MOBDC-MR technique was implemented on Hadoop with the MapReduce programming model. Its performance was validated on a benchmark dataset, and the results were examined under several measures. The MOBDC-MR technique demonstrated promising performance over existing techniques across different dimensions.
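
The abstract does not give the BPOA update rules, so the following is only a minimal sketch of the general idea behind binary metaheuristic feature selection, not the authors' method. It assumes two common conventions that this listing does not confirm: a sigmoid transfer function that turns a continuous position into a 0/1 feature mask, and a fitness that weighs classifier error against the fraction of features kept. The names `binarize`, `fitness`, and `alpha` are hypothetical.

```python
import math
import random


def binarize(position, rng):
    """Turn a continuous position vector into a 0/1 feature mask.

    Assumption: a sigmoid transfer function gives the probability
    that each feature is selected (a common convention, not taken
    from the paper).
    """
    mask = []
    for x in position:
        prob = 1.0 / (1.0 + math.exp(-x))
        mask.append(1 if rng.random() < prob else 0)
    return mask


def fitness(mask, error_rate, alpha=0.99):
    """Score a feature mask; lower is better.

    Assumption: fitness is a weighted sum of the classifier's error
    rate and the fraction of features kept. The error rate is passed
    in, since the classifier itself is outside this sketch.
    """
    kept = sum(mask)
    if kept == 0:
        # A mask that selects nothing cannot be used for classification.
        return float("inf")
    return alpha * error_rate + (1.0 - alpha) * kept / len(mask)


if __name__ == "__main__":
    rng = random.Random(0)
    mask = binarize([2.0, -2.0, 0.5, -0.5], rng)
    print(mask, fitness(mask, error_rate=0.1))
```

In a MapReduce setting, the error rate for one mask would typically be computed per data partition in the map step and aggregated in the reduce step; that part is omitted here because the listing gives no details about it.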

Suggested Citation

  • Felwa Abukhodair & Wafaa Alsaggaf & Amani Tariq Jamal & Sayed Abdel-Khalek & Romany F. Mansour, 2021. "An Intelligent Metaheuristic Binary Pigeon Optimization-Based Feature Selection and Big Data Classification in a MapReduce Environment," Mathematics, MDPI, vol. 9(20), pages 1-14, October.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:20:p:2627-:d:658917

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/20/2627/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/20/2627/
    Download Restriction: no

    References listed on IDEAS

    1. Vivien Marx, 2013. "The big challenges of big data," Nature, Nature, vol. 498(7453), pages 255-260, June.
    2. Harish Garg & Muhammad Riaz & Muhammad Abdullah Khokhar & Maryam Saba, 2021. "Correlation Measures for Cubic m-Polar Fuzzy Sets with Applications," Mathematical Problems in Engineering, Hindawi, vol. 2021, pages 1-19, September.

    Citations


    Cited by:

    1. Nahla Mohammed Elzein & Mazlina Abdul Majid & Ibrahim Abaker Targio Hashem & Ashraf Osman Ibrahim & Anas W. Abulfaraj & Faisal Binzagr, 2023. "JQPro:Join Query Processing in a Distributed System for Big RDF Data Using the Hash-Merge Join Technique," Mathematics, MDPI, vol. 11(5), pages 1-20, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lin Zhu & Xiantao Liu & Sha He & Jun Shi & Ming Pang, 2015. "Keywords co-occurrence mapping knowledge domain research base on the theory of Big Data in oil and gas industry," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(1), pages 249-260, October.
    2. Zhang, Yi & Huang, Ying & Porter, Alan L. & Zhang, Guangquan & Lu, Jie, 2019. "Discovering and forecasting interactions in big data research: A learning-enhanced bibliometric study," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 795-807.
    3. Stefano Bianchini & Moritz Müller & Pierre Pelletier, 2022. "Artificial intelligence in science: An emerging general method of invention," Post-Print hal-03958025, HAL.
    4. Jun Feng & Zhenting Li & Shizhen Zhang & Chun Bao & Jingxian Fang & Yun Yin & Bolei Chen & Lei Pan & Bing Wang & Yu Zheng, 2023. "A Microimage-Processing-Based Technique for Detecting Qualitative and Quantitative Characteristics of Plant Cells," Agriculture, MDPI, vol. 13(9), pages 1-16, September.
    5. Tang, Ming & Liao, Huchang, 2021. "From conventional group decision making to large-scale group decision making: What are the challenges and how to meet them in big data era? A state-of-the-art survey," Omega, Elsevier, vol. 100(C).
    6. Janssen, Marijn & van der Voort, Haiko & Wahyudi, Agung, 2017. "Factors influencing big data decision-making quality," Journal of Business Research, Elsevier, vol. 70(C), pages 338-345.
    7. Haitham Nobanee & Mehroz Nida Dilshad & Mona Al Dhanhani & Maitha Al Neyadi & Sultan Al Qubaisi & Saeed Al Shamsi, 2021. "Big Data Applications the Banking Sector: A Bibliometric Analysis Approach," SAGE Open, , vol. 11(4), pages 21582440211, December.
    8. Reza Farrahi Moghaddam & Fereydoun Farrahi Moghaddam & Mohamed Cheriet, 2014. "A Multi-Entity Input Output (MEIO) Approach to Sustainability - Water-Energy-GHG (WEG) Footprint Statements in Use Cases from Auto and Telco Industries," Papers 1404.6227, arXiv.org, revised Apr 2014.
    9. Iftikhar Ul Haq & Tanzeela Shaheen & Wajid Ali & Hamza Toor & Tapan Senapati & Francesco Pilla & Sarbast Moslem, 2023. "Novel Fermatean Fuzzy Aczel–Alsina Model for Investment Strategy Selection," Mathematics, MDPI, vol. 11(14), pages 1-23, July.
    10. Yoshiyuki Ogata & Kazuto Mannen & Yasuto Kotani & Naohiro Kimura & Nozomu Sakurai & Daisuke Shibata & Hideyuki Suzuki, 2018. "ConfeitoGUI: A toolkit for size-sensitive community detection from a correlation network," PLOS ONE, Public Library of Science, vol. 13(10), pages 1-18, October.
    11. S. Vijayakumar Bharathi, 2017. "Prioritizing and Ranking the Big Data Information Security Risk Spectrum," Global Journal of Flexible Systems Management, Springer;Global Institute of Flexible Systems Management, vol. 18(3), pages 183-201, September.
    12. Jonathan E Butner & Ascher K Munion & Brian R W Baucom & Alexander Wong, 2019. "Ghost hunting in the nonlinear dynamic machine," PLOS ONE, Public Library of Science, vol. 14(12), pages 1-21, December.
    13. Subhroshekhar Ghosh & Soumendu Sundar Mukherjee, 2022. "Learning with latent group sparsity via heat flow dynamics on networks," Papers 2201.08326, arXiv.org.
    14. J. Lars Kirkby & Dang H. Nguyen & Duy Nguyen & Nhu N. Nguyen, 2022. "Inversion-free subsampling Newton’s method for large sample logistic regression," Statistical Papers, Springer, vol. 63(3), pages 943-963, June.
    15. Zbysław Dobrowolski, 2021. "Internet of Things and Other E-Solutions in Supply Chain Management May Generate Threats in the Energy Sector—The Quest for Preventive Measures," Energies, MDPI, vol. 14(17), pages 1-11, August.
    16. Dawen Xia & Xiaonan Lu & Huaqing Li & Wendong Wang & Yantao Li & Zili Zhang, 2018. "A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data," Complexity, Hindawi, vol. 2018, pages 1-16, January.
    17. Matteo Fontana & Massimo Tavoni & Simone Vantini, 2019. "Functional Data Analysis of high-frequency load curves reveals drivers of residential electricity consumption," PLOS ONE, Public Library of Science, vol. 14(6), pages 1-16, June.
    18. Lu Jiang & Xinyu Kang & Shan Huang & Bo Yang, 2022. "A refinement strategy for identification of scientific software from bioinformatics publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 3293-3316, June.
    19. Gamermann, Daniel & Antunes, Felipe Leite, 2018. "Statistical analysis of Brazilian electoral campaigns via Benford’s law," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 496(C), pages 171-188.
    20. Alberto Fernández & Sara Río & Abdullah Bawakid & Francisco Herrera, 2017. "Fuzzy rule based classification systems for big data with MapReduce: granularity analysis," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(4), pages 711-730, December.
