
Cereal and Rapeseed Yield Forecast in Poland at Regional Level Using Machine Learning and Classical Statistical Models

Author

Listed:
  • Edyta Okupska

    (Seed and Agricultural Farm, “Bovinas” Ltd., Chodow 17, Chodow, 62-652 Poznań, Poland)

  • Dariusz Gozdowski

    (Department of Biometry, Institute of Agriculture, Warsaw University of Life Sciences, Nowoursynowska 159, 02-776 Warsaw, Poland)

  • Rafał Pudełko

    (Department of Bioeconomy and Systems Analysis, Institute of Soil Science and Plant Cultivation—State Research Institute (IUNG-PIB), Czartoryskich 8, 24-100 Pulawy, Poland)

  • Elżbieta Wójcik-Gront

    (Department of Biometry, Institute of Agriculture, Warsaw University of Life Sciences, Nowoursynowska 159, 02-776 Warsaw, Poland)

Abstract

This study performed in-season yield prediction, about 2–3 months before harvest, for cereals and rapeseed at the province level in Poland for 2009–2024. Several models were employed, including machine learning algorithms and multiple linear regression. The satellite-derived normalized difference vegetation index (NDVI) and the climatic water balance (CWB), calculated from meteorological data, were used as predictors of crop yield. The accuracy of the models was compared to identify the optimal approach. The strongest correlations with crop yield were observed for the NDVI at the beginning of March, with coefficients ranging from 0.454 for rapeseed to 0.503 for rye. The highest R² values were obtained with different prediction models depending on the crop, ranging from 0.654 for rapeseed with the random forest model to 0.777 for basic cereals with linear regression. The random forest model was best for rapeseed yield, while for cereals the best predictions came from multiple linear regression or neural network models. For the studied crops, all models had mean absolute errors and root mean squared errors not exceeding 6 dt/ha, which is relatively small, as it is under 20% of the mean yield. For the best models, relative errors in most cases did not exceed 10% of the mean yield. The results showed that linear regression and machine learning models produced similar predictions, likely due to the relatively small sample size (256 observations).
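The accuracy measures named in the abstract (mean absolute error, root mean squared error, R², and error relative to the mean yield) can be illustrated with a minimal sketch. The yield values below are made up for illustration and are not taken from the study, the function names are hypothetical, and R² is computed here as the coefficient of determination, which is an assumption, since the abstract does not state the exact definition the authors used.

```python
import math

def mae(observed, predicted):
    """Mean absolute error, in the same units as the yields (e.g. dt/ha)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error, in the same units as the yields."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def relative_error_pct(error, observed):
    """Express an absolute error as a percentage of the mean observed yield."""
    return 100.0 * error / (sum(observed) / len(observed))

# Hypothetical yields in dt/ha; NOT data from the study.
observed = [40.0, 50.0, 60.0, 50.0]
predicted = [42.0, 48.0, 57.0, 53.0]

mae_value = mae(observed, predicted)
rmse_value = rmse(observed, predicted)
r2_value = r_squared(observed, predicted)
relative_mae = relative_error_pct(mae_value, observed)

print(f"MAE  = {mae_value:.2f} dt/ha")
print(f"RMSE = {rmse_value:.2f} dt/ha")
print(f"R2   = {r2_value:.3f}")
print(f"MAE as share of mean yield = {relative_mae:.1f}%")
```

Dividing an absolute error by the mean observed yield is what allows errors to be compared across crops with different yield levels, as in the abstract's 10% and 20% thresholds.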

Suggested Citation

  • Edyta Okupska & Dariusz Gozdowski & Rafał Pudełko & Elżbieta Wójcik-Gront, 2025. "Cereal and Rapeseed Yield Forecast in Poland at Regional Level Using Machine Learning and Classical Statistical Models," Agriculture, MDPI, vol. 15(9), pages 1-16, May.
  • Handle: RePEc:gam:jagris:v:15:y:2025:i:9:p:984-:d:1647723

    Download full text from publisher

    File URL: https://www.mdpi.com/2077-0472/15/9/984/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2077-0472/15/9/984/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gaona, Jaime & Benito-Verdugo, Pilar & Martínez-Fernández, José & González-Zamora, Ángel & Almendra-Martín, Laura & Herrero-Jiménez, Carlos Miguel, 2023. "Predictive value of soil moisture and concurrent variables in the multivariate modelling of cereal yields in water-limited environments," Agricultural Water Management, Elsevier, vol. 282(C).
    2. Oyenike Mary Olanrewaju & Eli Adama Jiya & Faith Oluwatosin Echobu, 2024. "Intelligent Maize Yield Prediction Model Based on Plant Attributes and Machine Learning Algorithms," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(7), pages 1097-1104, July.
    3. Zhou, Hongkui & Huang, Fudeng & Lou, Weidong & Gu, Qing & Ye, Ziran & Hu, Hao & Zhang, Xiaobin, 2025. "Yield prediction through UAV-based multispectral imaging and deep learning in rice breeding trials," Agricultural Systems, Elsevier, vol. 223(C).
    4. Yan, Ling & Jin, Jiming & Wu, Pute, 2020. "Impact of parameter uncertainty and water stress parameterization on wheat growth simulations using CERES-Wheat with GLUE," Agricultural Systems, Elsevier, vol. 181(C).
    5. Schmidt, Lorenz & Odening, Martin & Schlanstein, Johann & Ritter, Matthias, 2021. "Estimation of the Farm-Level Yield-Weather-Relation Using Machine Learning," 61st Annual Conference, Berlin, Germany, September 22-24, 2021 317075, German Association of Agricultural Economists (GEWISOLA).
    6. Pavithra Mahesh & Rajkumar Soundrapandiyan, 2024. "Yield prediction for crops by gradient-based algorithms," PLOS ONE, Public Library of Science, vol. 19(8), pages 1-20, August.
    7. Timsina, Jagadish & Dutta, Sudarshan & Devkota, Krishna Prasad & Chakraborty, Somsubhra & Neupane, Ram Krishna & Bishta, Sudarshan & Amgain, Lal Prasad & Singh, Vinod K. & Islam, Saiful & Majumdar, Ka, 2021. "Improved nutrient management in cereals using Nutrient Expert and machine learning tools: Productivity, profitability and nutrient use efficiency," Agricultural Systems, Elsevier, vol. 192(C).
    8. Amba Shalishe & Anirudh Bhowmick & Kumneger Elias, 2023. "Agricultural drought analysis and its association among land surface temperature, soil moisture and precipitation in Gamo Zone, Southern Ethiopia: a remote sensing approach," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 117(1), pages 57-70, May.
    9. Nandan, Rohit & Bandaru, Varaprasad & Meduri, Pridhvi & Jones, Curtis & Lollato, Romulo, 2024. "Evaluating the utility of weather generators in crop simulation models for in-season yield forecasting," Agricultural Systems, Elsevier, vol. 220(C).
    10. Kouame, Anselme K.K. & Bindraban, Prem S. & Kissiedu, Isaac N. & Atakora, Williams K. & El Mejahed, Khalil, 2023. "Identifying drivers for variability in maize (Zea mays L.) yield in Ghana: A meta-regression approach," Agricultural Systems, Elsevier, vol. 209(C).
    11. repec:plo:pone00:0151782 is not listed on IDEAS
    12. Potopová, Vera & Trnka, Miroslav & Hamouz, Pavel & Soukup, Josef & Castraveț, Tudor, 2020. "Statistical modelling of drought-related yield losses using soil moisture-vegetation remote sensing and multiscalar indices in the south-eastern Europe," Agricultural Water Management, Elsevier, vol. 236(C).
    13. Kalpana Jain & Naveen Choudhary, 2022. "Comparative analysis of machine learning techniques for predicting production capability of crop yield," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(1), pages 583-593, March.
    14. Wu, Bingfang & Ma, Zonghan & Boken, Vijendra K. & Zeng, Hongwei & Shang, Jiali & Igor, Savin & Wang, Jinxia & Yan, Nana, 2022. "Regional differences in the performance of drought mitigation measures in 12 major wheat-growing regions of the world," Agricultural Water Management, Elsevier, vol. 273(C).
    15. Sebastian C. Ibañez & Christopher P. Monterola, 2023. "A Global Forecasting Approach to Large-Scale Crop Production Prediction with Time Series Transformers," Agriculture, MDPI, vol. 13(9), pages 1-27, September.
    16. Oumnia Ennaji & Sfia Baha & Leonardus Vergutz & Achraf El Allali, 2024. "Gradient boosting for yield prediction of elite maize hybrid ZhengDan 958," PLOS ONE, Public Library of Science, vol. 19(12), pages 1-16, December.
    17. Iwona Jaskulska & Jarosław Kamieniarz & Dariusz Jaskulski & Maja Radziemska & Martin Brtnický, 2023. "Fungicidal Protection as Part of the Integrated Cultivation of Sugar Beet: An Assessment of the Influence on Root Yield in a Long-Term Study," Agriculture, MDPI, vol. 13(7), pages 1-10, July.
    18. Yingnan Wei & Han Ru & Xiaolan Leng & Zhijian He & Olusola O. Ayantobo & Tehseen Javed & Ning Yao, 2022. "Better Performance of the Modified CERES-Wheat Model in Simulating Evapotranspiration and Wheat Growth under Water Stress Conditions," Agriculture, MDPI, vol. 12(11), pages 1-15, November.
    19. Paudel, Dilli & Boogaard, Hendrik & de Wit, Allard & Janssen, Sander & Osinga, Sjoukje & Pylianidis, Christos & Athanasiadis, Ioannis N., 2021. "Machine learning for large-scale crop yield forecasting," Agricultural Systems, Elsevier, vol. 187(C).
    20. Ghulam Mustafa & Muhammad Ali Moazzam & Asif Nawaz & Tariq Ali & Deema Mohammed Alsekait & Ahmed Saleh Alattas & Diaa Salama AbdElminaam, 2025. "ECP-IEM: Enhancing seasonal crop productivity with deep integrated models," PLOS ONE, Public Library of Science, vol. 20(2), pages 1-23, February.
    21. Ihsan F. Hasan & Rozi Abdullah, 2022. "Agricultural Drought Characteristics Analysis Using Copula," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 36(15), pages 5915-5930, December.
