Printed from https://ideas.repec.org/a/gam/jagris/v15y2025i9p984-d1647723.html

Cereal and Rapeseed Yield Forecast in Poland at Regional Level Using Machine Learning and Classical Statistical Models

Author

Listed:
  • Edyta Okupska

    (Seed and Agricultural Farm, “Bovinas” Ltd., Chodow 17, Chodow, 62-652 Poznań, Poland)

  • Dariusz Gozdowski

    (Department of Biometry, Institute of Agriculture, Warsaw University of Life Sciences, Nowoursynowska 159, 02-776 Warsaw, Poland)

  • Rafał Pudełko

    (Department of Bioeconomy and Systems Analysis, Institute of Soil Science and Plant Cultivation—State Research Institute (IUNG-PIB), Czartoryskich 8, 24-100 Pulawy, Poland)

  • Elżbieta Wójcik-Gront

    (Department of Biometry, Institute of Agriculture, Warsaw University of Life Sciences, Nowoursynowska 159, 02-776 Warsaw, Poland)

Abstract

This study performed in-season yield prediction, about 2–3 months before harvest, for cereals and rapeseed at the province level in Poland for 2009–2024. Various models were employed, including machine learning algorithms and multiple linear regression. The satellite-derived normalized difference vegetation index (NDVI) and the climatic water balance (CWB), calculated from meteorological data, were used as predictors of crop yield. The accuracy of the models was compared to identify the optimal approach. The strongest correlations with crop yield were observed for the NDVI at the beginning of March, with coefficients ranging from 0.454 for rapeseed to 0.503 for rye. The highest R² values were obtained with different prediction models depending on the crop, ranging from 0.654 for rapeseed (random forest) to 0.777 for basic cereals (linear regression). The random forest model was best for rapeseed yield, while for cereals the best predictions came from multiple linear regression or neural network models. For the studied crops, all models had mean absolute errors and root mean squared errors not exceeding 6 dt/ha, which is relatively small because it is under 20% of the mean yield. For the best models, relative errors in most cases did not exceed 10% of the mean yield. The results showed that linear regression and machine learning models gave similar predictions, likely due to the relatively small sample size (256 observations).
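The accuracy measures named in the abstract (mean absolute error, root mean squared error, R², and errors relative to the mean yield) can be illustrated with a minimal Python sketch. The function name `error_metrics` and the three yield values are hypothetical and are not taken from the paper. R² is computed here as 1 − SS_res/SS_tot, which is an assumption, since the abstract does not state which definition or validation scheme the authors used.

```python
import math


def error_metrics(observed, predicted):
    """Return MAE, RMSE, R^2 and errors relative to the mean observed yield.

    Hypothetical helper for illustration; yields are assumed to be in dt/ha.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]

    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares

    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot  # assumed definition of R^2

    return {
        "mae": mae,
        "rmse": rmse,
        "r2": r2,
        "mae_pct": 100.0 * mae / mean_obs,    # MAE as % of mean yield
        "rmse_pct": 100.0 * rmse / mean_obs,  # RMSE as % of mean yield
    }


# Made-up yields in dt/ha, for illustration only (not data from the study).
observed = [40.0, 50.0, 60.0]
predicted = [42.0, 49.0, 57.0]
metrics = error_metrics(observed, predicted)
print(metrics)
```

Expressing MAE and RMSE as a percentage of the mean yield is what allows statements such as "under 20% of the mean yield" to be compared across crops with different yield levels.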

Suggested Citation

  • Edyta Okupska & Dariusz Gozdowski & Rafał Pudełko & Elżbieta Wójcik-Gront, 2025. "Cereal and Rapeseed Yield Forecast in Poland at Regional Level Using Machine Learning and Classical Statistical Models," Agriculture, MDPI, vol. 15(9), pages 1-16, May.
  • Handle: RePEc:gam:jagris:v:15:y:2025:i:9:p:984-:d:1647723

Download full text from publisher

  • File URL: https://www.mdpi.com/2077-0472/15/9/984/pdf
    Download Restriction: no

  • File URL: https://www.mdpi.com/2077-0472/15/9/984/
    Download Restriction: no
