Authors
Jingxin Yu, Wengang Zheng, Linlin Xu, Fanyu Meng, Jing Li, Lili Zhangzhong
Abstract
• An adaptive TPE-CatBoost method for soil moisture spatial estimation was proposed.
• Optimal accuracy was obtained by accounting for soil, meteorological, and location covariates.
• The AI algorithm performed better than the traditional GIS spatial interpolation algorithm.
• The proposed model was found to be susceptible to environmental changes of covariates.

Maize is one of the major crops in China. The soil water content (SWC) in the root zone of maize is a critical indicator that guides agricultural production decisions and can affect national food security. However, daily-scale, high-precision spatial estimation of SWC in China's main maize-producing areas has not been well researched. We therefore developed TPE-CatBoost, a spatial estimation model for SWC with a dynamic parameter optimization mechanism. It uses the CatBoost algorithm as the core fitting framework and the efficient tree-structured Parzen estimator (TPE) algorithm to achieve dynamic hyperparameter optimization based on covariate characteristics. Daily measured multi-depth SWC data from 175 stations from 2015 to 2019 were used as the reference truth, and 18 covariates, covering soil physical and chemical properties, daily meteorological conditions, and spatial location, were obtained from Google Earth Engine. Model training was performed using leave-one-out cross-validation. Differences in estimation error were investigated along four dimensions: time, space, depth, and model. Our key results are as follows: (1) combining all covariates gave the highest estimation accuracy at every soil depth, with a mean absolute error (MAE) within [6.06%, 6.94%]; the five covariates with the highest mean importance scores were latitude, soil pH, bulk density, DEM, and dewpoint temperature; (2) the MAE for all years remained within [4.66%, 9.34%], with higher errors in June; (3) the MAE for each province remained within [3.5%, 8.29%], with errors decreasing from north to south; and (4) compared with GIS-based spatial interpolation methods (inverse distance weighted, ordinary Kriging, and empirical Bayesian Kriging [EBK]), artificial intelligence (AI) algorithms that incorporate environmental covariates (XGBoost, CatBoost, and TPE-CatBoost) achieved better estimation accuracy. In particular, TPE-CatBoost performed well, with an improvement of 15.1% over EBK. Using the SHapley Additive exPlanation (SHAP) algorithm, we also showed that TPE-CatBoost was susceptible to changes in the gain capacity of covariates under extreme weather conditions. Visual mapping of single-day spatial estimation results in ArcGIS showed distribution trends highly consistent with the Soil Moisture Active Passive (SMAP) product.
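
To make the evaluation idea concrete, the following is a minimal sketch of how an inverse distance weighted (IDW) baseline could be scored with leave-one-out cross-validation and MAE. It is not the authors' implementation: the function names, the power parameter of 2, and the four station values are hypothetical, and it assumes that "leave-one-out" means each station is withheld in turn and estimated from the remaining stations.

```python
import math


def idw_estimate(target, stations, power=2.0):
    """Inverse distance weighted estimate at target (x, y).

    stations is a list of (x, y, value) tuples. The power of 2 is an
    assumed default, not a value taken from the paper.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            # The target coincides with a station, so return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def leave_one_out_mae(stations, power=2.0):
    """Withhold each station in turn, estimate it from the rest, average |error|."""
    abs_errors = []
    for i, (x, y, value) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        predicted = idw_estimate((x, y), others, power)
        abs_errors.append(abs(predicted - value))
    return sum(abs_errors) / len(abs_errors)


# Hypothetical stations: (x, y, soil water content in %). Illustrative only.
stations = [
    (0.0, 0.0, 20.0),
    (1.0, 0.0, 22.0),
    (0.0, 1.0, 18.0),
    (1.0, 1.0, 24.0),
]

center_estimate = idw_estimate((0.5, 0.5), stations)
mae = leave_one_out_mae(stations)
print(f"IDW estimate at centre: {center_estimate:.2f}%")
print(f"Leave-one-out MAE: {mae:.2f}%")
```

The same loop could score a covariate-based model by replacing `idw_estimate` with a regressor fitted on the remaining stations, which is the kind of comparison the abstract reports between interpolation methods and the AI algorithms.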