Printed from https://ideas.repec.org/a/gam/jagris/v12y2022i12p2089-d994653.html

Prediction of Blueberry (Vaccinium corymbosum L.) Yield Based on Artificial Intelligence Methods

Author

Listed:
  • Gniewko Niedbała

    (Department of Biosystems Engineering, Faculty of Environmental and Mechanical Engineering, Poznań University of Life Sciences, Wojska Polskiego 50, 60-627 Poznań, Poland)

  • Jarosław Kurek

    (Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences, Nowoursynowska 159, 02-776 Warsaw, Poland)

  • Bartosz Świderski

    (Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences, Nowoursynowska 159, 02-776 Warsaw, Poland)

  • Tomasz Wojciechowski

    (Department of Biosystems Engineering, Faculty of Environmental and Mechanical Engineering, Poznań University of Life Sciences, Wojska Polskiego 50, 60-627 Poznań, Poland)

  • Izabella Antoniuk

    (Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences, Nowoursynowska 159, 02-776 Warsaw, Poland)

  • Krzysztof Bobran

    (Seth Software sp. z o.o., Strefowa 1, 36-060 Głogów Małopolski, Poland)

Abstract

In this paper, we present a high-accuracy model for blueberry yield prediction, trained on structurally innovative data sets. Blueberries are flowering plants valued for their antioxidant and anti-inflammatory properties. Plantation yield depends on several factors, both internal and external. Accurately predicting the harvest size is important for work planning and storage space selection. Machine learning algorithms are commonly used in such prediction tasks, since they are capable of finding correlations between the various factors at play. The data were collected over the years 2016–2021 and included agronomic, climatic and soil data as well as satellite-imaging vegetation data. Additionally, growing periods according to the BBCH scale and aggregates were taken into account. After extensive data preprocessing and derivation of cumulative features, a total of 11 models were trained and evaluated. The models were selected from state-of-the-art methods used in similar applications. Mean Absolute Percentage Error (MAPE) was chosen to evaluate the results. It was preferred over alternatives because it uses absolute values, which removes the risk that errors of opposite sign cancel out, while the final result expresses the percentage difference between the actual value and the prediction. In the presented research, the best-performing solution proved to be the Extreme Gradient Boosting algorithm, with a MAPE of 12.48%. This result meets the requirements of practical applications, with sufficient accuracy to improve the overall yield management process. Due to the nature of machine learning methodology, the presented solution can be further improved with annually collected data.
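
The abstract's argument for MAPE can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes the standard definition of MAPE (mean of |actual − predicted| / |actual|, expressed in percent). The function names and the yield values are hypothetical, chosen only to show how signed errors can cancel while absolute errors do not.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent (standard definition, assumed here)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mean_signed_percentage_error(actual, predicted):
    """Same as mape() but without the absolute value, for comparison only."""
    return 100.0 * sum((a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical yields (not taken from the paper):
# one over-prediction and one under-prediction, each off by 20%.
actual = [10.0, 20.0]
predicted = [12.0, 16.0]

signed = mean_signed_percentage_error(actual, predicted)
absolute = mape(actual, predicted)

print(f"signed error: {signed:.2f}%")    # errors of opposite sign cancel
print(f"MAPE:         {absolute:.2f}%")  # absolute errors do not cancel
```

In this toy case the signed error averages to zero although each prediction is off by a fifth, whereas MAPE reports roughly 20%, which is the behaviour the abstract describes.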

Suggested Citation

  • Gniewko Niedbała & Jarosław Kurek & Bartosz Świderski & Tomasz Wojciechowski & Izabella Antoniuk & Krzysztof Bobran, 2022. "Prediction of Blueberry (Vaccinium corymbosum L.) Yield Based on Artificial Intelligence Methods," Agriculture, MDPI, vol. 12(12), pages 1-27, December.
  • Handle: RePEc:gam:jagris:v:12:y:2022:i:12:p:2089-:d:994653

    Download full text from publisher

    File URL: https://www.mdpi.com/2077-0472/12/12/2089/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2077-0472/12/12/2089/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gniewko Niedbała & Danuta Kurasiak-Popowska & Magdalena Piekutowska & Tomasz Wojciechowski & Michał Kwiatek & Jerzy Nawracała, 2022. "Application of Artificial Neural Network Sensitivity Analysis to Identify Key Determinants of Harvesting Date and Yield of Soybean (Glycine max [L.] Merrill) Cultivar Augusta," Agriculture, MDPI, vol. 12(6), pages 1-17, May.
    2. Patryk Hara & Magdalena Piekutowska & Gniewko Niedbała, 2022. "Prediction of Protein Content in Pea (Pisum sativum L.) Seeds Using Artificial Neural Networks," Agriculture, MDPI, vol. 13(1), pages 1-21, December.
    3. Patryk Hara & Magdalena Piekutowska & Gniewko Niedbała, 2023. "Prediction of Pea (Pisum sativum L.) Seeds Yield Using Artificial Neural Networks," Agriculture, MDPI, vol. 13(3), pages 1-19, March.
    4. Piotr Boniecki & Agnieszka Sujak & Gniewko Niedbała & Hanna Piekarska-Boniecka & Agnieszka Wawrzyniak & Andrzej Przybylak, 2023. "Neural Modelling from the Perspective of Selected Statistical Methods on Examples of Agricultural Applications," Agriculture, MDPI, vol. 13(4), pages 1-19, March.
    5. Angelini, Claudia & De Canditiis, Daniela & Pensky, Marianna, 2009. "Bayesian models for two-sample time-course microarray experiments," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1547-1565, March.
    6. Jian Wang & Haiping Si & Zhao Gao & Lei Shi, 2022. "Winter Wheat Yield Prediction Using an LSTM Model from MODIS LAI Products," Agriculture, MDPI, vol. 12(10), pages 1-13, October.
    7. Wayne DeSarbo & J. Carroll & Linda Clark & Paul Green, 1984. "Synthesized clustering: A method for amalgamating alternative clustering bases with differential weighting of variables," Psychometrika, Springer;The Psychometric Society, vol. 49(1), pages 57-78, March.
    8. Ando, Tomohiro & Bai, Jushan, 2021. "Large-scale generalized linear longitudinal data models with grouped patterns of unobserved heterogeneity," MPRA Paper 111431, University Library of Munich, Germany.
    9. M. Vrac & L. Billard & E. Diday & A. Chédin, 2012. "Copula analysis of mixture models," Computational Statistics, Springer, vol. 27(3), pages 427-457, September.
    10. J. Carroll & James Corter, 1995. "A graph-theoretic method for organizing overlapping clusters into trees, multiple trees, or extended trees," Journal of Classification, Springer;The Classification Society, vol. 12(2), pages 283-313, September.
    11. Bruno Scarpa & David B. Dunson, 2009. "Bayesian Hierarchical Functional Data Analysis Via Contaminated Informative Priors," Biometrics, The International Biometric Society, vol. 65(3), pages 772-780, September.
    12. Satoru Yokoyama & Atsuho Nakayama & Akinori Okada, 2009. "One-mode three-way overlapping cluster analysis," Computational Statistics, Springer, vol. 24(1), pages 165-179, February.
    13. Jim Q. Smith & Paul E. Anderson & Silvia Liverani, 2008. "Separation measures and the geometry of Bayes factor selection for classification," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 957-980, November.
    14. J. Carroll & Linda Clark & Wayne DeSarbo, 1984. "The representation of three-way proximity data by single and multiple tree structure models," Journal of Classification, Springer;The Classification Society, vol. 1(1), pages 25-74, December.
    15. Yoshio Takane & Justine Sergent, 1983. "Multidimensional scaling models for reaction times and same-different judgments," Psychometrika, Springer;The Psychometric Society, vol. 48(3), pages 393-423, September.
    16. Joachim Harloff, 2011. "Extracting cover sets from free fuzzy sorting data," Quality & Quantity: International Journal of Methodology, Springer, vol. 45(6), pages 1445-1457, October.
    17. Christos Vasilakos & George E. Tsekouras & Dimitris Kavroudakis, 2022. "LSTM-Based Prediction of Mediterranean Vegetation Dynamics Using NDVI Time-Series Data," Land, MDPI, vol. 11(6), pages 1-23, June.
    18. repec:jss:jstsof:47:i05 is not listed on IDEAS
    19. Daewon Yang & Taeryon Choi & Eric Lavigne & Yeonseung Chung, 2022. "Non‐parametric Bayesian covariate‐dependent multivariate functional clustering: An application to time‐series data for multiple air pollutants," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1521-1542, November.
    20. Giuseppe Bove & Akinori Okada, 2018. "Methods for the analysis of asymmetric pairwise relationships," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(1), pages 5-31, March.
    21. Wang, Ying & Shi, Wenjuan & Wen, Tianyang, 2023. "Prediction of winter wheat yield and dry matter in North China Plain using machine learning algorithms for optimal water and nitrogen application," Agricultural Water Management, Elsevier, vol. 277(C).
