
Design and Implementation of a Scalable Data Warehouse for Agricultural Big Data

Author

Listed:
  • Asterios Theofilou

    (Department of Agricultural Economics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece)

  • Stefanos A. Nastis

    (Department of Agricultural Economics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece)

  • Michail Tsagris

    (Department of Economics, University of Crete, 74100 Rethymno, Greece)

  • Santiago Rodriguez-Perez

    (Biotechnology Applications, IDENER, Early Ovington 24 Nave 8-9, 41300 Seville, Spain)

  • Konstadinos Mattas

    (Department of Agricultural Economics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece)

Abstract

The rapid growth of agricultural data calls for storage systems that can store, retrieve, and analyze very large datasets scalably and efficiently. Traditional relational database management systems (RDBMSs) struggle with large-scale analytical queries because of the volume and complexity of these data. This study presents the design and implementation of a scalable data warehouse (DWH) system for agricultural big data. The proposed solution integrates data efficiently and optimizes data ingestion, transformation, and query performance by leveraging a distributed architecture based on HDFS, Apache Hive, and Apache Spark, deployed in Dockerized Ubuntu Linux environments. This paper highlights why a DWH is irreplaceable for big data processing, without disputing the strengths of traditional databases in transactional use cases. By detailing the architectural choices and implementation strategy, this study provides a practical framework for deploying robust DWH solutions that support agricultural research, market predictions, and policy decision-making.

Suggested Citation

  • Asterios Theofilou & Stefanos A. Nastis & Michail Tsagris & Santiago Rodriguez-Perez & Konstadinos Mattas, 2025. "Design and Implementation of a Scalable Data Warehouse for Agricultural Big Data," Sustainability, MDPI, vol. 17(8), pages 1-19, April.
  • Handle: RePEc:gam:jsusta:v:17:y:2025:i:8:p:3727-:d:1638701

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/17/8/3727/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/17/8/3727/
    Download Restriction: no
