Automatic classification of data-warehouse-data for information lifecycle management using machine learning techniques

Author

Listed:
  • Sebastian Büsch

    (Ilmenau University of Technology)

  • Volker Nissen

    (Ilmenau University of Technology)

  • Arndt Wünscher

    (Ilmenau University of Technology)

Abstract

The aim of Information Lifecycle Management (ILM) is to govern data throughout its lifecycle as efficiently and, from a technical point of view, as effectively as possible. A core question is where data should be stored, since storage locations differ in cost and access time. To answer it, data have to be classified, which at present is done either manually and laboriously or on the basis of only a few data attributes, in particular access frequency. In the context of data warehouse systems, this article introduces an automated, and therefore fast and cost-effective, data classification for ILM. Machine learning techniques, in particular an artificial neural network (multilayer perceptron), a support vector machine and a decision tree approach, are compared on an SAP-based real-world data set from the automotive industry. The classification considers a large number of data attributes and thus attains results similar to those of human experts. Besides classification accuracy, the comparison also takes into account the types of misclassification that occur, since these matter in ILM.
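The point about misclassification types can be illustrated with a small sketch. It is not the authors' evaluation procedure: the storage classes (`hot`, `warm`, `cold`), the penalty table `COST` and the sample labels are all hypothetical, chosen only to show how two classifiers with the same accuracy can differ once errors are weighted by their consequences.

```python
from collections import Counter

# Hypothetical penalty for storing an object of the first (true) class
# in the second (predicted) class. These values are assumptions for
# illustration: sending frequently needed data to slow storage
# (hot -> cold) is penalised more than keeping rarely used data on
# fast storage (cold -> hot).
COST = {
    ("hot", "warm"): 2.0, ("hot", "cold"): 5.0,
    ("warm", "hot"): 1.0, ("warm", "cold"): 2.0,
    ("cold", "hot"): 2.0, ("cold", "warm"): 1.0,
}


def confusion_counts(y_true, y_pred):
    """Count how often each (true class, predicted class) pair occurs."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return Counter(zip(y_true, y_pred))


def accuracy(counts):
    """Share of objects whose predicted class equals the true class."""
    total = sum(counts.values())
    correct = sum(n for (true, pred), n in counts.items() if true == pred)
    return correct / total


def mean_cost(counts, cost=COST):
    """Average penalty per object; correct predictions cost nothing."""
    total = sum(counts.values())
    return sum(cost.get(pair, 0.0) * n for pair, n in counts.items()) / total


# Made-up expert labels and the output of two imaginary classifiers.
y_true = ["hot", "hot", "warm", "cold", "cold", "cold"]
pred_a = ["hot", "cold", "warm", "cold", "cold", "cold"]  # hot object sent to cold
pred_b = ["hot", "hot", "warm", "cold", "cold", "warm"]   # cold object kept on warm

counts_a = confusion_counts(y_true, pred_a)
counts_b = confusion_counts(y_true, pred_b)

for name, counts in (("A", counts_a), ("B", counts_b)):
    print(name, "accuracy:", accuracy(counts), "mean cost:", mean_cost(counts))
```

Both imaginary classifiers make exactly one error, so their accuracy is identical, yet classifier A incurs the higher mean cost under the assumed penalties. In a real comparison the penalty table would have to come from the cost and access-time requirements of the storage tiers in question.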

Suggested Citation

  • Sebastian Büsch & Volker Nissen & Arndt Wünscher, 2017. "Automatic classification of data-warehouse-data for information lifecycle management using machine learning techniques," Information Systems Frontiers, Springer, vol. 19(5), pages 1085-1099, October.
  • Handle: RePEc:spr:infosf:v:19:y:2017:i:5:d:10.1007_s10796-016-9680-8
    DOI: 10.1007/s10796-016-9680-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10796-016-9680-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. David L. Olson & Dursun Delen, 2008. "Advanced Data Mining Techniques," Springer Books, Springer, number 978-3-540-76917-0, June.
    2. Hasso Plattner & Alexander Zeier, 2011. "In-Memory Data Management," Springer Books, Springer, number 978-3-642-19363-7, June.
    3. Markus Lilienthal, 2013. "A Decision Support Model for Cloud Bursting," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 5(2), pages 71-81, April.

    Citations

    Cited by:

    1. Vijayan Sugumaran & T. V. Geetha & D. Manjula & Hema Gopal, 2017. "Guest Editorial: Computational Intelligence and Applications," Information Systems Frontiers, Springer, vol. 19(5), pages 969-974, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sebastian Büsch & Volker Nissen & Arndt Wünscher, 0. "Automatic classification of data-warehouse-data for information lifecycle management using machine learning techniques," Information Systems Frontiers, Springer, vol. 0, pages 1-15.
    2. Tobias Knabke & Sebastian Olbrich, 2018. "Building novel capabilities to enable business intelligence agility: results from a quantitative study," Information Systems and e-Business Management, Springer, vol. 16(3), pages 493-546, August.
    3. Mark Gilchrist & Deana Lehmann Mooers & Glenn Skrubbeltrang & Francine Vachon, 2012. "Knowledge Discovery in Databases for Competitive Advantage," Journal of Management and Strategy, Journal of Management and Strategy, Sciedu Press, vol. 3(2), pages 2-15, April.
    4. Marina Johnson & Abdullah Albizri & Serhat Simsek, 2022. "Artificial intelligence in healthcare operations to enhance treatment outcomes: a framework to predict lung cancer prognosis," Annals of Operations Research, Springer, vol. 308(1), pages 275-305, January.
    5. Simsek, Serhat & Dag, Ali & Tiahrt, Thomas & Oztekin, Asil, 2021. "A Bayesian Belief Network-based probabilistic mechanism to determine patient no-show risk categories," Omega, Elsevier, vol. 100(C).
    6. Yucel, Ahmet & Dag, Ali & Oztekin, Asil & Carpenter, Mark, 2022. "A novel text analytic methodology for classification of product and service reviews," Journal of Business Research, Elsevier, vol. 151(C), pages 287-297.
    7. Oztekin, Asil & Kizilaslan, Recep & Freund, Steven & Iseri, Ali, 2016. "A data analytic approach to forecasting daily stock returns in an emerging market," European Journal of Operational Research, Elsevier, vol. 253(3), pages 697-710.
    8. Saljooghi, Saeed & Safisamghabadib, Azamdokht, 2016. "Analyzing Semiconductor component's market sales data to create an Expert Fuzzy inference system," MPRA Paper 79846, University Library of Munich, Germany.
    9. Ramin Vakili & Mojdeh Khorsand, 2022. "A Machine Learning-Based Method for Identifying Critical Distance Relays for Transient Stability Studies," Energies, MDPI, vol. 15(23), pages 1-28, November.
    10. Delen, Dursun & Cogdell, Douglas & Kasap, Nihat, 2012. "A comparative analysis of data mining methods in predicting NCAA bowl outcomes," International Journal of Forecasting, Elsevier, vol. 28(2), pages 543-552.
    11. Chen, Kunlong & Zheng, Fangdan & Jiang, Jiuchun & Zhang, Weige & Jiang, Yan & Chen, Kunjin, 2017. "Practical failure recognition model of lithium-ion batteries based on partial charging process," Energy, Elsevier, vol. 138(C), pages 1199-1208.
    12. Martin Kowalczyk & Peter Buxmann, 2014. "Big Data and Information Processing in Organizational Decision Processes," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 6(5), pages 267-278, October.
    13. Emrouznejad, Ali & De Witte, Kristof, 2010. "COOPER-framework: A unified process for non-parametric projects," European Journal of Operational Research, Elsevier, vol. 207(3), pages 1573-1586, December.
    14. Shaheen, Muhammad & Khan, Muhammad Zeb, 2016. "A method of data mining for selection of site for wind turbines," Renewable and Sustainable Energy Reviews, Elsevier, vol. 55(C), pages 1225-1233.
    15. Peter Loos & Jens Lechtenbörger & Gottfried Vossen & Alexander Zeier & Jens Krüger & Jürgen Müller & Wolfgang Lehner & Donald Kossmann & Benjamin Fabian & Oliver Günther & Robert Winter, 2011. "In-memory Databases in Business Information Systems," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 3(6), pages 389-395, December.
    16. Asil Oztekin, 2018. "Information fusion-based meta-classification predictive modeling for ETF performance," Information Systems Frontiers, Springer, vol. 20(2), pages 223-238, April.
    17. Abdorrahman Haeri, 2020. "Analyzing safety level and recognizing flaws of commercial centers through data mining approach," Journal of Risk and Reliability, vol. 234(3), pages 512-526, June.
    18. Renhe Hu & Zihan Hui & Yifan Li & Jueqi Guan, 2023. "Research on Learning Concentration Recognition with Multi-Modal Features in Virtual Reality Environments," Sustainability, MDPI, vol. 15(15), pages 1-16, July.
    19. Kazim Topuz & Hasmet Uner & Asil Oztekin & Mehmet Bayram Yildirim, 2018. "Predicting pediatric clinic no-shows: a decision analytic framework using elastic net and Bayesian belief network," Annals of Operations Research, Springer, vol. 263(1), pages 479-499, April.
    20. Mehri, Ali & Darooneh, Amir H., 2011. "The role of entropy in word ranking," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 390(18), pages 3157-3163.
