
Sanitizing data for analysis: Designing systems for data understanding

Authors

  • Joshua Holstein

    (Karlsruhe Institute of Technology Kaiserstraße)

  • Max Schemmer

    (Karlsruhe Institute of Technology Kaiserstraße)

  • Johannes Jakubik

    (Karlsruhe Institute of Technology Kaiserstraße)

  • Michael Vössing

    (Karlsruhe Institute of Technology Kaiserstraße)

  • Gerhard Satzger

    (Karlsruhe Institute of Technology Kaiserstraße)

Abstract

As organizations accumulate vast amounts of data for analysis, a significant challenge remains in fully understanding these datasets to extract accurate information and generate real-world impact. In particular, the high dimensionality of datasets and the lack of sufficient documentation, specifically the provision of metadata, often limit the potential to exploit the full value of data via analytical methods. To address these issues, this study proposes a hybrid approach to metadata generation that leverages both the in-depth knowledge of domain experts and the scalability of automated processes. The approach centers on two key design principles, semanticization and contextualization, to facilitate the understanding of high-dimensional datasets. A real-world case study conducted at a leading pharmaceutical company validates the effectiveness of this approach, demonstrating improved collaboration and knowledge sharing among users. By addressing the challenges in metadata generation, this research contributes significantly toward empowering organizations to make more effective, data-driven decisions.

Suggested Citation

  • Joshua Holstein & Max Schemmer & Johannes Jakubik & Michael Vössing & Gerhard Satzger, 2023. "Sanitizing data for analysis: Designing systems for data understanding," Electronic Markets, Springer;IIM University of St. Gallen, vol. 33(1), pages 1-18, December.
  • Handle: RePEc:spr:elmark:v:33:y:2023:i:1:d:10.1007_s12525-023-00677-w
    DOI: 10.1007/s12525-023-00677-w

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s12525-023-00677-w
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Christian Janiesch & Patrick Zschech & Kai Heinrich, 2021. "Machine learning and deep learning," Electronic Markets, Springer;IIM University of St. Gallen, vol. 31(3), pages 685-695, September.
    2. Benjamin M. Abdel-Karim & Nicolas Pfeuffer & Oliver Hinz, 2021. "Machine learning in information systems - a bibliographic review and open research issues," Electronic Markets, Springer;IIM University of St. Gallen, vol. 31(3), pages 643-670, September.
    3. John Venable & Jan Pries-Heje & Richard Baskerville, 2016. "FEDS: a Framework for Evaluation in Design Science Research," European Journal of Information Systems, Taylor & Francis Journals, vol. 25(1), pages 77-89, January.
    4. Bokrantz, Jon & Skoogh, Anders & Berlin, Cecilia & Wuest, Thorsten & Stahre, Johan, 2020. "Smart Maintenance: a research agenda for industrial maintenance management," International Journal of Production Economics, Elsevier, vol. 224(C).
    5. Bill Kuechler & Vijay Vaishnavi, 2008. "On theory development in design science research: anatomy of a research project," European Journal of Information Systems, Taylor & Francis Journals, vol. 17(5), pages 489-504, October.
    6. Daniel Fürstenau & Stefan Klein & Amyn Vogel & Carolin Auschra, 2021. "Multi-sided platform and data-driven care research," Electronic Markets, Springer;IIM University of St. Gallen, vol. 31(4), pages 811-828, December.
    7. Thong, J. Y. L. & Yap, C. S., 1995. "CEO characteristics, organizational characteristics and information technology adoption in small businesses," Omega, Elsevier, vol. 23(4), pages 429-442, August.
    8. Rainer Alt, 2021. "How to organize for AI? An interview with Yao-Hua Tan," Electronic Markets, Springer;IIM University of St. Gallen, vol. 31(3), pages 639-642, September.
    9. Christian Janiesch & Barbara Dinter & Patrick Mikalef & Olgerta Tona, 2022. "Business analytics and big data research in information systems," Journal of Business Analytics, Taylor & Francis Journals, vol. 5(1), pages 1-7, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Julius Peter Landwehr & Niklas Kühl & Jannis Walk & Mario Gnädig, 2022. "Design Knowledge for Deep-Learning-Enabled Image-Based Decision Support Systems," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 64(6), pages 707-728, December.
    2. Rainer Alt, 2021. "Electronic Markets on robotics," Electronic Markets, Springer;IIM University of St. Gallen, vol. 31(3), pages 465-471, September.
    3. Christian Engel & Philipp Ebel & Jan Marco Leimeister, 2022. "Cognitive automation," Electronic Markets, Springer;IIM University of St. Gallen, vol. 32(1), pages 339-350, March.
    4. Niklas Kühl & Max Schemmer & Marc Goutier & Gerhard Satzger, 2022. "Artificial intelligence and machine learning," Electronic Markets, Springer;IIM University of St. Gallen, vol. 32(4), pages 2235-2244, December.
    5. Jen-Yu Lee & Tien-Thinh Nguyen & Hong-Giang Nguyen & Jen-Yao Lee, 2022. "Towards Predictive Crude Oil Purchase: A Case Study in the USA and Europe," Energies, MDPI, vol. 15(11), pages 1-15, May.
    6. Eduard Hartwich & Alexander Rieger & Johannes Sedlmeir & Dominik Jurek & Gilbert Fridgen, 2023. "Machine economies," Electronic Markets, Springer;IIM University of St. Gallen, vol. 33(1), pages 1-13, December.
    7. Raffaele Fabio Ciriello & Alexandra Cecilie Gjøl Torbensen & Magnus Rotvit Perlt Hansen & Christoph Müller-Bloch, 2023. "Blockchain-based digital rights management systems: Design principles for the music industry," Electronic Markets, Springer;IIM University of St. Gallen, vol. 33(1), pages 1-21, December.
    8. Rui Ma & Jia Wang & Wei Zhao & Hongjie Guo & Dongnan Dai & Yuliang Yun & Li Li & Fengqi Hao & Jinqiang Bai & Dexin Ma, 2022. "Identification of Maize Seed Varieties Using MobileNetV2 with Improved Attention Mechanism CBAM," Agriculture, MDPI, vol. 13(1), pages 1-16, December.
    9. Jamil Paolo Francisco & Tristan Canare & Jean Rebecca Labios, 2018. "Obstacles and Enablers of Internationalization of Philippine SMEs Through Participation in Global Value Chains," Working Papers id:12905, eSocialSciences.
    10. Dylan Norbert Gono & Herlina Napitupulu & Firdaniza, 2023. "Silver Price Forecasting Using Extreme Gradient Boosting (XGBoost) Method," Mathematics, MDPI, vol. 11(18), pages 1-15, September.
    11. Cheng Yang & Fuhao Sun & Yujie Zou & Zhipeng Lv & Liang Xue & Chao Jiang & Shuangyu Liu & Bochao Zhao & Haoyang Cui, 2024. "A Survey of Photovoltaic Panel Overlay and Fault Detection Methods," Energies, MDPI, vol. 17(4), pages 1-37, February.
    12. Yunhee Kim & Jae Young Choi & Yeonbae Kim, 2011. "Complementarity and contextuality in the adoption of information systems," Applied Economics Letters, Taylor & Francis Journals, vol. 18(16), pages 1613-1618.
    13. Tom Lewandowski & Emir Kučević & Stephan Leible & Mathis Poser & Tilo Böhmann, 2023. "Enhancing conversational agents for successful operation: A multi-perspective evaluation approach for continuous improvement," Electronic Markets, Springer;IIM University of St. Gallen, vol. 33(1), pages 1-20, December.
    14. Hendrik Haße & Hendrik Valk & Frederik Möller & Boris Otto, 2022. "Design Principles for Shared Digital Twins in Distributed Systems," Business & Information Systems Engineering: The International Journal of WIRTSCHAFTSINFORMATIK, Springer;Gesellschaft für Informatik e.V. (GI), vol. 64(6), pages 751-772, December.
    15. Naruemon Choochinprakarn, 2015. "Strategic Uses of Electronic Commerce for Thai Travel Small and Medium Enterprises (SMEs)," Proceedings of Business and Management Conferences 2303915, International Institute of Social and Economic Sciences.
    16. Ramzi El-Haddadeh, 0. "Digital Innovation Dynamics Influence on Organisational Adoption: The Case of Cloud Computing Services," Information Systems Frontiers, Springer, vol. 0, pages 1-15.
    17. Shuai Sang & Lu Li, 2024. "A Novel Variant of LSTM Stock Prediction Method Incorporating Attention Mechanism," Mathematics, MDPI, vol. 12(7), pages 1-20, March.
    18. Elissa Dwi Lestari, 2022. "The Effect of Financial Literacy, Cost of Technology Adoption, Technology Perceived Usefulness, and Government Support on MSMEs' Business Resilience," GATR Journals gjbssr620, Global Academy of Training and Research (GATR) Enterprise.
    19. Vladimir Franki & Darin Majnarić & Alfredo Višković, 2023. "A Comprehensive Review of Artificial Intelligence (AI) Companies in the Power Sector," Energies, MDPI, vol. 16(3), pages 1-35, January.
    20. Teo, Thompson S.H. & Lin, Sijie & Lai, Kee-hung, 2009. "Adopters and non-adopters of e-procurement in Singapore: An empirical study," Omega, Elsevier, vol. 37(5), pages 972-987, October.

    More about this item

    Keywords

    Data understanding; Data governance; Metadata generation

    JEL classification:

    • M15 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Business Administration - - - IT Management
    • L6 - Industrial Organization - - Industry Studies: Manufacturing
