
Improving the Performance of Data Mining by Using Big Data in Cloud Environment

Author

Listed:
  • Djilali Dahmani

    (Department of Mathematics and Computer Science, University of Sciences and Technology-Mohammed Boudiaf USTO, Oran, Algeria)

  • Sid Ahmed Rahal

    (Department of Mathematics and Computer Science, University of Sciences and Technology-Mohammed Boudiaf USTO, Oran, Algeria)

  • Ghalem Belalem

    (Department of Computer Science, Faculty of Exact and Applied Sciences, University of Oran 1, Ahmed Ben Bella, Oran, Algeria)

Abstract

The volume of business data is growing rapidly, and most of these data are relational. Extracting knowledge with data mining requires keeping all historical data, which makes processing and storage increasingly difficult and demands power and capacity beyond the ability of any single machine. Distributed environments such as cloud computing are therefore very useful for sharing storage and processing among multiple nodes. Unfortunately, data based on the relational model cannot easily be used in the cloud because of the model's rigidity and limited elasticity in such environments. To address this issue, new big data systems such as NoSQL have appeared that make data easier to share and distribute in cloud environments. In theory, this benefits the data mining use case. In practice, however, the benefit must be demonstrated by evaluating performance for both multi-node NoSQL and single-node relational systems. In the cloud case, it is also of interest to know whether performance keeps increasing in proportion to the number of nodes, and whether there is an optimum number of nodes at which performance levels off or starts to drop. Motivated by these questions, we propose in this paper an approach for migrating relational data to an appropriate NoSQL system in a cloud environment, and then evaluate performance to obtain results of interest for data mining. For the experiments, we use industrial data deployed in a data mining process at an oil and gas company. After migrating these data, we run experiments to compare and evaluate storage, processing and execution time. Our objective is to verify data elasticity, measure run-time performance, and try to find the optimum number of nodes.
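
The question of an optimum number of nodes raised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the function names, the 5% gain threshold, and the timing values are all assumptions made for illustration, and the timings are placeholders rather than measurements from the paper. The sketch computes speedup relative to a single node and returns the last node count that still improves run time by at least the chosen threshold.

```python
from typing import Dict


def speedups(run_times: Dict[int, float], baseline_nodes: int = 1) -> Dict[int, float]:
    """Speedup of each configuration relative to the baseline node count."""
    base = run_times[baseline_nodes]
    return {n: base / t for n, t in sorted(run_times.items())}


def optimum_nodes(run_times: Dict[int, float], min_gain: float = 0.05) -> int:
    """Return the largest node count reached before the relative run-time
    improvement over the previous configuration falls below min_gain
    (this also stops when run time gets worse).

    min_gain is a hypothetical threshold, not a value from the paper.
    """
    counts = sorted(run_times)
    best = counts[0]
    for prev, cur in zip(counts, counts[1:]):
        gain = (run_times[prev] - run_times[cur]) / run_times[prev]
        if gain < min_gain:
            break
        best = cur
    return best


# Placeholder timings in seconds (node count -> run time); NOT data from the paper.
example_times = {1: 100.0, 2: 55.0, 4: 30.0, 8: 29.0, 16: 33.0}

if __name__ == "__main__":
    print(speedups(example_times))
    print(optimum_nodes(example_times))  # prints 4 for the placeholder timings
```

With these placeholder values, going from 4 to 8 nodes improves run time by only about 3%, so the sketch reports 4 nodes as the point where performance levels off. A different threshold or a different stopping rule would give a different answer.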

Suggested Citation

  • Djilali Dahmani & Sid Ahmed Rahal & Ghalem Belalem, 2016. "Improving the Performance of Data Mining by Using Big Data in Cloud Environment," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 15(04), pages 1-18, December.
  • Handle: RePEc:wsi:jikmxx:v:15:y:2016:i:04:n:s0219649216500386
    DOI: 10.1142/S0219649216500386

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219649216500386
    Download Restriction: Access to full text is restricted to subscribers

    File URL: https://libkey.io/10.1142/S0219649216500386?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Citations



    Cited by:

    1. Bibi Alajmi & Talal Alhaji, 2018. "Mapping the Field of Knowledge Management: Bibliometric and Content Analysis of Journal of Information & Knowledge Management for the Period from 2002–2016," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 17(03), pages 1-16, September.
