
Fuzzy rule based classification systems for big data with MapReduce: granularity analysis

Author

Listed:
  • Alberto Fernández

    (University of Granada)

  • Sara Río

    (University of Granada)

  • Abdullah Bawakid

    (King Abdulaziz University (KAU))

  • Francisco Herrera

    (University of Granada; King Abdulaziz University (KAU))

Abstract

Due to the vast amount of information available nowadays, and the advantages of processing this data, big data and data science have acquired great importance in current research. Big data applications are mainly about scalability, which can be achieved via the MapReduce programming model. It is designed to divide the data into several chunks or groups that are processed in parallel, and whose results are “assembled” to provide a single solution. Among the different classification paradigms adapted to this new framework, fuzzy rule based classification systems have shown interesting results with a MapReduce approach for big data. It is well known that the performance of these types of systems depends strongly on the selection of a good granularity level for the Data Base. However, in the context of MapReduce this parameter is even harder to determine, as it can also be related to the number of Maps chosen for the processing stage. In this paper, we aim at analyzing the interrelation between the number of labels of the fuzzy variables and the scarcity of the data due to the data sampling in MapReduce. Specifically, we consider that as the partitioning of the initial instance set grows, the level of granularity necessary to achieve a good performance also becomes higher. The experimental results, carried out for several Big Data problems, and using the Chi-FRBCS-BigData algorithms, support our claims.
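
To make the two parameters discussed in the abstract concrete, the following is a minimal Python sketch of a Chi-style rule learner run in a map/reduce fashion. It is an illustrative stand-in, not the authors' Chi-FRBCS-BigData implementation: the function names, the uniform triangular labels over [0, 1], the round-robin split, and the "sum the supports and keep the strongest class" merge are all assumptions made for this example. The parameter n_maps plays the role of the number of Maps, and n_labels the granularity of the fuzzy partitions.

```python
from collections import defaultdict

def membership(x, label, n_labels):
    """Triangular membership of x in [0, 1] for one of n_labels uniform labels (n_labels >= 2)."""
    step = 1.0 / (n_labels - 1)
    center = label * step
    return max(0.0, 1.0 - abs(x - center) / step)

def best_label(x, n_labels):
    """Index of the label with the highest membership for x."""
    return max(range(n_labels), key=lambda l: membership(x, l, n_labels))

def map_chunk(chunk, n_labels):
    """Map step (assumed form): build a local rule base {antecedent: {class: support}}."""
    rules = defaultdict(lambda: defaultdict(float))
    for features, cls in chunk:
        antecedent = tuple(best_label(x, n_labels) for x in features)
        # Matching degree (product of memberships) is the example's support for the rule.
        degree = 1.0
        for x, label in zip(features, antecedent):
            degree *= membership(x, label, n_labels)
        rules[antecedent][cls] += degree
    return rules

def reduce_rule_bases(rule_bases):
    """Reduce step (assumed form): merge local rule bases, keep the best-supported class."""
    merged = defaultdict(lambda: defaultdict(float))
    for rule_base in rule_bases:
        for antecedent, by_class in rule_base.items():
            for cls, support in by_class.items():
                merged[antecedent][cls] += support
    return {a: max(by_class, key=by_class.get) for a, by_class in merged.items()}

def split(data, n_maps):
    """Round-robin partition of the instance set into n_maps chunks."""
    return [data[i::n_maps] for i in range(n_maps)]

def train(data, n_maps, n_labels):
    """Run the map step on every chunk, then assemble a single rule base."""
    return reduce_rule_bases(map_chunk(c, n_labels) for c in split(data, n_maps))

# Hypothetical toy data: (normalized features, class label).
data = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((1.0, 1.0), "b"), ((0.9, 1.0), "b")]
rule_base = train(data, n_maps=2, n_labels=3)
```

In this sketch, raising n_maps leaves each map_chunk call with fewer instances, while raising n_labels multiplies the number of possible antecedents; the sketch only exposes the two knobs and makes no claim about which combination performs best.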

Suggested Citation

  • Alberto Fernández & Sara Río & Abdullah Bawakid & Francisco Herrera, 2017. "Fuzzy rule based classification systems for big data with MapReduce: granularity analysis," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(4), pages 711-730, December.
  • Handle: RePEc:spr:advdac:v:11:y:2017:i:4:d:10.1007_s11634-016-0260-z
    DOI: 10.1007/s11634-016-0260-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11634-016-0260-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11634-016-0260-z?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lin Zhu & Xiantao Liu & Sha He & Jun Shi & Ming Pang, 2015. "Keywords co-occurrence mapping knowledge domain research base on the theory of Big Data in oil and gas industry," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(1), pages 249-260, October.
    2. Janssen, Marijn & van der Voort, Haiko & Wahyudi, Agung, 2017. "Factors influencing big data decision-making quality," Journal of Business Research, Elsevier, vol. 70(C), pages 338-345.
    3. Daphne R. Raban & Avishag Gordon, 2020. "The evolution of data science and big data research: A bibliometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(3), pages 1563-1581, March.
    4. Jonathan E Butner & Ascher K Munion & Brian R W Baucom & Alexander Wong, 2019. "Ghost hunting in the nonlinear dynamic machine," PLOS ONE, Public Library of Science, vol. 14(12), pages 1-21, December.
    5. J. Lars Kirkby & Dang H. Nguyen & Duy Nguyen & Nhu N. Nguyen, 2022. "Inversion-free subsampling Newton’s method for large sample logistic regression," Statistical Papers, Springer, vol. 63(3), pages 943-963, June.
    6. Lu Jiang & Xinyu Kang & Shan Huang & Bo Yang, 2022. "A refinement strategy for identification of scientific software from bioinformatics publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 3293-3316, June.
    7. Gamermann, Daniel & Antunes, Felipe Leite, 2018. "Statistical analysis of Brazilian electoral campaigns via Benford’s law," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 496(C), pages 171-188.
    8. Le Zhang & Chunqiu Zheng & Tian Li & Lei Xing & Han Zeng & Tingting Li & Huan Yang & Jia Cao & Badong Chen & Ziyuan Zhou, 2017. "Building Up a Robust Risk Mathematical Platform to Predict Colorectal Cancer," Complexity, Hindawi, vol. 2017, pages 1-14, October.
    9. Felwa Abukhodair & Wafaa Alsaggaf & Amani Tariq Jamal & Sayed Abdel-Khalek & Romany F. Mansour, 2021. "An Intelligent Metaheuristic Binary Pigeon Optimization-Based Feature Selection and Big Data Classification in a MapReduce Environment," Mathematics, MDPI, vol. 9(20), pages 1-14, October.
    10. Stefano Bianchini & Moritz Muller & Pierre Pelletier, 2020. "Deep Learning in Science," Papers 2009.01575, arXiv.org, revised Sep 2020.
    11. Chuang Lin & Guoliang Li & Zhiguang Shan & Yong Shi, 2017. "Thinking and Modeling for Big Data from the Perspective of the I Ching," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(06), pages 1451-1463, November.
    12. Bianchini, Stefano & Müller, Moritz & Pelletier, Pierre, 2022. "Artificial intelligence in science: An emerging general method of invention," Research Policy, Elsevier, vol. 51(10).
    13. Joyce de Souza Zanirato Maia & Ana Paula Arantes Bueno & Joao Ricardo Sato, 2023. "Applications of Artificial Intelligence Models in Educational Analytics and Decision Making: A Systematic Review," World, MDPI, vol. 4(2), pages 1-26, May.
    14. Zhang, Yi & Huang, Ying & Porter, Alan L. & Zhang, Guangquan & Lu, Jie, 2019. "Discovering and forecasting interactions in big data research: A learning-enhanced bibliometric study," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 795-807.
    15. Stefano Bianchini & Moritz Müller & Pierre Pelletier, 2022. "Artificial intelligence in science: An emerging general method of invention," Post-Print hal-03958025, HAL.
    16. Jun Feng & Zhenting Li & Shizhen Zhang & Chun Bao & Jingxian Fang & Yun Yin & Bolei Chen & Lei Pan & Bing Wang & Yu Zheng, 2023. "A Microimage-Processing-Based Technique for Detecting Qualitative and Quantitative Characteristics of Plant Cells," Agriculture, MDPI, vol. 13(9), pages 1-16, September.
    17. Tang, Ming & Liao, Huchang, 2021. "From conventional group decision making to large-scale group decision making: What are the challenges and how to meet them in big data era? A state-of-the-art survey," Omega, Elsevier, vol. 100(C).
    18. Haitham Nobanee & Mehroz Nida Dilshad & Mona Al Dhanhani & Maitha Al Neyadi & Sultan Al Qubaisi & Saeed Al Shamsi, 2021. "Big Data Applications the Banking Sector: A Bibliometric Analysis Approach," SAGE Open, vol. 11(4), pages 21582440211, December.
    19. Reza Farrahi Moghaddam & Fereydoun Farrahi Moghaddam & Mohamed Cheriet, 2014. "A Multi-Entity Input Output (MEIO) Approach to Sustainability - Water-Energy-GHG (WEG) Footprint Statements in Use Cases from Auto and Telco Industries," Papers 1404.6227, arXiv.org, revised Apr 2014.
    20. Yoshiyuki Ogata & Kazuto Mannen & Yasuto Kotani & Naohiro Kimura & Nozomu Sakurai & Daisuke Shibata & Hideyuki Suzuki, 2018. "ConfeitoGUI: A toolkit for size-sensitive community detection from a correlation network," PLOS ONE, Public Library of Science, vol. 13(10), pages 1-18, October.
