IDEAS home Printed from https://ideas.repec.org/a/gam/jsusta/v17y2025i13p6168-d1695199.html

Spatial Prediction of Soil Organic Carbon Based on a Multivariate Feature Set and Stacking Ensemble Algorithm: A Case Study of Wei-Ku Oasis in China

Author

Listed:
  • Zuming Cao

    (College of Geographic Science and Tourism, Xinjiang Normal University, Urumqi 830017, China
    Xinjiang Arid Zone Lake Environment and Resources Laboratory, Urumqi 830017, China)

  • Xiaowei Luo

    (College of Geographic Science and Tourism, Xinjiang Normal University, Urumqi 830017, China
    Xinjiang Arid Zone Lake Environment and Resources Laboratory, Urumqi 830017, China)

  • Xuemei Wang

    (College of Geographic Science and Tourism, Xinjiang Normal University, Urumqi 830017, China
    Xinjiang Arid Zone Lake Environment and Resources Laboratory, Urumqi 830017, China)

  • Dun Li

    (College of Geographic Science and Tourism, Xinjiang Normal University, Urumqi 830017, China
    Xinjiang Arid Zone Lake Environment and Resources Laboratory, Urumqi 830017, China)

Abstract

Accurate estimation of soil organic carbon (SOC) content is crucial for assessing terrestrial ecosystem carbon stocks. Although traditional methods offer relatively high estimation accuracy, they are limited by poor timeliness and high costs. Combining measured data, remote sensing technology, and machine learning (ML) algorithms enables rapid, efficient, and accurate large-scale prediction. However, single ML models often suffer from high feature variable redundancy and weak generalization ability, problems that integrated models can effectively overcome. This study focuses on the Weigan–Kuqa River oasis (Wei-Ku Oasis), a typical arid oasis in northwest China. It integrates Sentinel-2A multispectral imagery, a digital elevation model, ERA5 meteorological reanalysis data, soil attribute data, and land use (LU) data to estimate SOC. The Boruta algorithm, Lasso regression, and their combination were used to screen feature variables and construct a multidimensional feature space. Ensemble models such as Random Forest (RF), Gradient Boosting Machine (GBM), and the Stacking model were then built. The Stacking model built on the combined screened variable sets achieved the best prediction accuracy (test set R² = 0.61, RMSE = 2.17 g·kg⁻¹, RPD = 1.61), reducing the prediction error by 9% compared with single-model prediction. Difference Vegetation Index (DVI), Bare Soil Evapotranspiration (BSE), and type of land use (TLU) have a substantial multidimensional synergistic influence on the spatial differentiation pattern of SOC. Including TLU substantially improved the model's estimation performance, increasing the test set R² by 24%. Combining Boruta–Lasso screening with Stacking thus supports the construction of a high-precision SOC content estimation model that can provide technical support for precision fertilization in arid-zone oasis regions and for the management of regional carbon sinks.
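The abstract reports three accuracy metrics (R², RMSE, RPD) without defining them. The sketch below shows one common way to compute them. It is not taken from the paper: the function name `regression_metrics`, the use of the sample standard deviation for RPD, the definition of R² as one minus the ratio of residual to total sum of squares, and the sample values are all assumptions made for illustration.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE, RPD) for paired observed and predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_obs = sum(observed) / n
    # Residual and total sums of squares
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(ss_res / n)
    # Assumption: R^2 defined as 1 - SS_res / SS_tot
    r2 = 1.0 - ss_res / ss_tot
    # Assumption: RPD = sample standard deviation of observations / RMSE
    sd_obs = math.sqrt(ss_tot / (n - 1))
    rpd = sd_obs / rmse
    return r2, rmse, rpd


# Hypothetical SOC values in g/kg, for illustration only (not data from the study)
observed = [4.0, 6.0, 8.0, 10.0, 12.0]
predicted = [5.0, 5.0, 9.0, 9.0, 12.0]

r2, rmse, rpd = regression_metrics(observed, predicted)
print(f"R2={r2:.2f}  RMSE={rmse:.2f} g/kg  RPD={rpd:.2f}")
```

Under these definitions, and for large samples, R² ≈ 1 − 1/RPD². The reported test-set values are roughly consistent with that relation, since 1 − 1/1.61² ≈ 0.61, although the paper may define the metrics differently.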

Suggested Citation

  • Zuming Cao & Xiaowei Luo & Xuemei Wang & Dun Li, 2025. "Spatial Prediction of Soil Organic Carbon Based on a Multivariate Feature Set and Stacking Ensemble Algorithm: A Case Study of Wei-Ku Oasis in China," Sustainability, MDPI, vol. 17(13), pages 1-25, July.
  • Handle: RePEc:gam:jsusta:v:17:y:2025:i:13:p:6168-:d:1695199

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/17/13/6168/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/17/13/6168/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tong, Jianfeng & Liu, Zhenxing & Zhang, Yong & Zheng, Xiujuan & Jin, Junyang, 2023. "Improved multi-gate mixture-of-experts framework for multi-step prediction of gas load," Energy, Elsevier, vol. 282(C).
    2. Asma Shaheen & Javed Iqbal, 2018. "Spatial Distribution and Mobility Assessment of Carcinogenic Heavy Metals in Soil Profiles Using Geostatistics and Random Forest, Boruta Algorithm," Sustainability, MDPI, vol. 10(3), pages 1-20, March.
    3. Ramón Ferri-García & María del Mar Rueda, 2022. "Variable selection in Propensity Score Adjustment to mitigate selection bias in online surveys," Statistical Papers, Springer, vol. 63(6), pages 1829-1881, December.
    4. Yang Zhao & Denise Gorse, 2024. "Earthquake prediction from seismic indicators using tree-based ensemble learning," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 120(3), pages 2283-2309, February.
    5. Zhongen Niu & Huimin Yan & Fang Liu, 2020. "Decreasing Cropping Intensity Dominated the Negative Trend of Cropland Productivity in Southern China in 2000–2015," Sustainability, MDPI, vol. 12(23), pages 1-14, December.
    6. Manuel J. García Rodríguez & Vicente Rodríguez Montequín & Francisco Ortega Fernández & Joaquín M. Villanueva Balsera, 2019. "Public Procurement Announcements in Spain: Regulations, Data Analysis, and Award Price Estimator Using Machine Learning," Complexity, Hindawi, vol. 2019, pages 1-20, November.
    7. Sangjin Kim & Jong-Min Kim, 2019. "Two-Stage Classification with SIS Using a New Filter Ranking Method in High Throughput Data," Mathematics, MDPI, vol. 7(6), pages 1-16, May.
    8. Baihan Wang & Alfred Pozarickij & Mohsen Mazidi & Neil Wright & Pang Yao & Saredo Said & Andri Iona & Christiana Kartsonaki & Hannah Fry & Kuang Lin & Yiping Chen & Huaidong Du & Daniel Avery & Dan Sc, 2025. "Comparative studies of 2168 plasma proteins measured by two affinity-based platforms in 4000 Chinese adults," Nature Communications, Nature, vol. 16(1), pages 1-13, December.
    9. Foutzopoulos, Giorgos & Pandis, Nikolaos & Tsagris, Michail, 2024. "Predicting full retirement attainment of NBA players," MPRA Paper 121540, University Library of Munich, Germany.
    10. Zhao-Yue Chen & Hervé Petetin & Raúl Fernando Méndez Turrubiates & Hicham Achebak & Carlos Pérez García-Pando & Joan Ballester, 2024. "Population exposure to multiple air pollutants and its compound episodes in Europe," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    11. Schrader, Silja & Graham, Sonia & Campbell, Rebecca & Height, Kaitlyn & Hawkes, Gina, 2024. "Grower attitudes and practices toward area-wide management of cropping weeds in Australia," Land Use Policy, Elsevier, vol. 137(C).
    12. Yao Wang & Yimin Chen & Xinyuan Wang & Baiting Zhang & Yining Sun & Yuhan Zhang & Yuxuan Li & Yueyu Sui & Yingjie Dai, 2025. "Characteristics of the Spatiotemporal Evolution and Driving Mechanisms of Soil Organic Matter in the Songnen Plain in China," Agriculture, MDPI, vol. 15(20), pages 1-16, October.
    13. Rabin K. Jana & Indranil Ghosh, 2025. "A residual driven ensemble machine learning approach for forecasting natural gas prices: analyses for pre-and during-COVID-19 phases," Annals of Operations Research, Springer, vol. 345(2), pages 757-778, February.
    14. Piotr Pomorski & Denise Gorse, 2023. "Improving Portfolio Performance Using a Novel Method for Predicting Financial Regimes," Papers 2310.04536, arXiv.org.
    15. Caperna, Giulio & Colagrossi, Marco & Geraci, Andrea & Mazzarella, Gianluca, 2022. "A babel of web-searches: Googling unemployment during the pandemic," Labour Economics, Elsevier, vol. 74(C).
    16. Hakan Pabuccu & Adrian Barbu, 2023. "Feature Selection with Annealing for Forecasting Financial Time Series," Papers 2303.02223, arXiv.org, revised Feb 2024.
    17. Abolfazl Mollalo & Kiara M. Rivera & Behzad Vahedi, 2020. "Artificial Neural Network Modeling of Novel Coronavirus (COVID-19) Incidence Rates across the Continental United States," IJERPH, MDPI, vol. 17(12), pages 1-13, June.
    18. Pooja Preetha & Naveen Joseph, 2025. "Evaluating Modified Soil Erodibility Factors with the Aid of Pedotransfer Functions and Dynamic Remote-Sensing Data for Soil Health Management," Land, MDPI, vol. 14(3), pages 1-22, March.
    19. Chunyang Huang & Shaoliang Zhang, 2023. "Explainable artificial intelligence model for identifying Market Value in Professional Soccer Players," Papers 2311.04599, arXiv.org, revised Nov 2023.
    20. Yonghua Li & Song Yao & Hezhou Jiang & Huarong Wang & Qinchuan Ran & Xinyun Gao & Xinyi Ding & Dandong Ge, 2022. "Spatial-Temporal Evolution and Prediction of Carbon Storage: An Integrated Framework Based on the MOP–PLUS–InVEST Model and an Applied Case Study in Hangzhou, East China," Land, MDPI, vol. 11(12), pages 1-22, December.

