Printed from https://ideas.repec.org/a/gam/jsusta/v15y2023i3p2786-d1056816.html

Comparative Analysis of Statistical and Machine Learning Techniques for Rice Yield Forecasting for Chhattisgarh, India

Author

Listed:
  • Anurag Satpathi

    (Department of Agrometeorology, College of Agriculture, G.B. Pant University of Agriculture and Technology, Pantnagar 263153, India)

  • Parul Setiya

    (Department of Agrometeorology, College of Agriculture, G.B. Pant University of Agriculture and Technology, Pantnagar 263153, India)

  • Bappa Das

    (ICAR Central Coastal Agricultural Research Institute, Old Goa 403402, India)

  • Ajeet Singh Nain

    (Department of Agrometeorology, College of Agriculture, G.B. Pant University of Agriculture and Technology, Pantnagar 263153, India)

  • Prakash Kumar Jha

    (Sustainable Intensification Innovation Lab, Kansas State University, Manhattan, KS 66506, USA)

  • Surendra Singh

    (Columbia Basin Agricultural Research Center, Oregon State University, Adams, OR 97810, USA)

  • Shikha Singh

    (Hermiston Agricultural Research and Extension Center, Oregon State University, Hermiston, OR 97838, USA)

Abstract

Forecasting crop yield before harvest is critical for the creation, implementation, and optimization of policies related to food safety, as well as for agro-product storage and marketing. Because weather influences crop growth and development, models based on weather variables can provide reliable predictions of crop yields. Selecting the best crop production forecasting model, however, can be difficult. In this study, five alternative models, viz., stepwise multiple linear regression (SMLR), an artificial neural network (ANN), the least absolute shrinkage and selection operator (LASSO), an elastic net (ELNET), and ridge regression, were compared to identify the best model for rice yield prediction. The outputs of the individual models were then used to build ensemble models using the generalized linear model (GLM), random forest (RF), cubist, and ELNET methods. Historical rice yield statistics and meteorological data covering the previous 21 years were collected for three districts in three separate agro-climatic zones of Chhattisgarh, viz., Raipur in the Chhattisgarh plains, Surguja in the northern hills, and Bastar in the southern plateau. The models were calibrated using 80% of these datasets, and the remaining 20% was used for validation. For rice yield forecasting, the ANN outperformed the other models for the Raipur ($R^2_{cal}$ = 1, $R^2_{val}$ = 1 and $RMSE_{cal}$ = 0.002, $RMSE_{val}$ = 0.003) and Surguja ($R^2_{cal}$ = 1, $R^2_{val}$ = 0.99 and $RMSE_{cal}$ = 0.004, $RMSE_{val}$ = 0.214) districts, whereas for Bastar, ELNET ($R^2_{cal}$ = 90, $R^2_{val}$ = 0.48) and LASSO ($R^2_{cal}$ = 93, $R^2_{val}$ = 0.568) performed better. The ensemble models performed better than the individual models. For Raipur and Surguja, all the ensemble methods performed comparably, whereas for Bastar, random forest (RF) performed better than the GLM, cubist, and ELNET approaches, with $R^2$ = 0.85 and 0.81 for calibration and validation, respectively.
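The calibration/validation scheme and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, the chronological ordering of the split and the rounding of 80% of 21 records to 17 are assumptions, and R2 is computed here as 1 - SS_res/SS_tot, which may differ from the definition used in the paper. All numbers are made-up placeholders, not data from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def split_calibration_validation(records, calibration_fraction=0.8):
    """Assumed chronological split: the earliest records calibrate the model,
    the remaining (latest) records are held out for validation."""
    n_cal = round(calibration_fraction * len(records))
    return records[:n_cal], records[n_cal:]


# 21 seasons, indexed 1..21 (placeholder indices, not actual years).
seasons = list(range(1, 22))
cal_seasons, val_seasons = split_calibration_validation(seasons)

# Made-up yields purely to exercise the metric functions.
observed = [2.0, 2.5, 3.0, 3.5]
predicted = [2.1, 2.4, 3.2, 3.3]

print("calibration records:", len(cal_seasons))
print("validation records:", len(val_seasons))
print("R2:", round(r_squared(observed, predicted), 3))
print("RMSE:", round(rmse(observed, predicted), 3))
```

Computing both metrics separately on the calibration and validation subsets, as the abstract reports them, shows whether a model that fits the calibration years closely also generalizes to the held-out years.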

Suggested Citation

  • Anurag Satpathi & Parul Setiya & Bappa Das & Ajeet Singh Nain & Prakash Kumar Jha & Surendra Singh & Shikha Singh, 2023. "Comparative Analysis of Statistical and Machine Learning Techniques for Rice Yield Forecasting for Chhattisgarh, India," Sustainability, MDPI, vol. 15(3), pages 1-18, February.
  • Handle: RePEc:gam:jsusta:v:15:y:2023:i:3:p:2786-:d:1056816

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/15/3/2786/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/15/3/2786/
    Download Restriction: no


    Citations



    Cited by:

    1. Jin, Keyan & Zhong, Ziqi & Zhao, Elena Yifei, 2024. "Sustainable digital marketing under big data: an AI random forest model approach," LSE Research Online Documents on Economics 121402, London School of Economics and Political Science, LSE Library.
    2. Saim Khalid & Hadi Mohsen Oqaibi & Muhammad Aqib & Yaser Hafeez, 2023. "Small Pests Detection in Field Crops Using Deep Learning Object Detection," Sustainability, MDPI, vol. 15(8), pages 1-19, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Štefan Lyócsa & Petra Vašaničová & Branka Hadji Misheva & Marko Dávid Vateha, 2022. "Default or profit scoring credit systems? Evidence from European and US peer-to-peer lending markets," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-21, December.
    2. Zander S. Venter & Adam Sadilek & Charlotte Stanton & David N. Barton & Kristin Aunan & Sourangsu Chowdhury & Aaron Schneider & Stefano Maria Iacus, 2021. "Mobility in Blue-Green Spaces Does Not Predict COVID-19 Transmission: A Global Analysis," IJERPH, MDPI, vol. 18(23), pages 1-12, November.
    3. Yagli, Gokhan Mert & Yang, Dazhi & Srinivasan, Dipti, 2019. "Automatic hourly solar forecasting using machine learning models," Renewable and Sustainable Energy Reviews, Elsevier, vol. 105(C), pages 487-498.
    4. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    5. Paweł Teisseyre & Robert A. Kłopotek & Jan Mielniczuk, 2016. "Random Subspace Method for high-dimensional regression with the R package regRSM," Computational Statistics, Springer, vol. 31(3), pages 943-972, September.
    6. Satre-Meloy, Aven & Diakonova, Marina & Grünewald, Philipp, 2020. "Cluster analysis and prediction of residential peak demand profiles using occupant activity data," Applied Energy, Elsevier, vol. 260(C).
    7. Merlijn Breugel & Cancan Qi & Zhongli Xu & Casper-Emil T. Pedersen & Ilya Petoukhov & Judith M. Vonk & Ulrike Gehring & Marijn Berg & Marnix Bügel & Orestes A. Carpaij & Erick Forno & Andréanne Morin, 2022. "Nasal DNA methylation at three CpG sites predicts childhood allergic disease," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    8. Mielniczuk, Jan & Teisseyre, Paweł, 2014. "Using random subspace method for prediction and variable importance assessment in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 725-742.
    9. Vera Wendler-Bosco & Charles Nicholson, 2022. "Modeling the economic impact of incoming tropical cyclones using machine learning," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 110(1), pages 487-518, January.
    10. A. Jiran Meitei & Akanksha Saini & Bibhuti Bhusan Mohapatra & Kh. Jitenkumar Singh, 2022. "Predicting child anaemia in the North-Eastern states of India: a machine learning approach," International Journal of System Assurance Engineering and Management, Springer;The Society for Reliability, Engineering Quality and Operations Management (SREQOM),India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 13(6), pages 2949-2962, December.
    11. Schroeders, Ulrich & Watrin, Luc & Wilhelm, Oliver, 2021. "Age-related nuances in knowledge assessment," Intelligence, Elsevier, vol. 85(C).
    12. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    13. Oxana Babecka Kucharcukova & Jan Bruha, 2016. "Nowcasting the Czech Trade Balance," Working Papers 2016/11, Czech National Bank.
    14. Carstensen, Kai & Heinrich, Markus & Reif, Magnus & Wolters, Maik H., 2020. "Predicting ordinary and severe recessions with a three-state Markov-switching dynamic factor model," International Journal of Forecasting, Elsevier, vol. 36(3), pages 829-850.
    15. Hou-Tai Chang & Ping-Huai Wang & Wei-Fang Chen & Chen-Ju Lin, 2022. "Risk Assessment of Early Lung Cancer with LDCT and Health Examinations," IJERPH, MDPI, vol. 19(8), pages 1-12, April.
    16. Margherita Giuzio, 2017. "Genetic algorithm versus classical methods in sparse index tracking," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 40(1), pages 243-256, November.
    17. Nicolaj N. Mühlbach, 2020. "Tree-based Synthetic Control Methods: Consequences of moving the US Embassy," CREATES Research Papers 2020-04, Department of Economics and Business Economics, Aarhus University.
    18. Wang, Qiao & Zhou, Wei & Cheng, Yonggang & Ma, Gang & Chang, Xiaolin & Miao, Yu & Chen, E, 2018. "Regularized moving least-square method and regularized improved interpolating moving least-square method with nonsingular moment matrices," Applied Mathematics and Computation, Elsevier, vol. 325(C), pages 120-145.
    19. Dmitriy Drusvyatskiy & Adrian S. Lewis, 2018. "Error Bounds, Quadratic Growth, and Linear Convergence of Proximal Methods," Mathematics of Operations Research, INFORMS, vol. 43(3), pages 919-948, August.
    20. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
