Authors
Chao Wang, Conghe Song, Todd A. Schroeder, Curtis E. Woodcock, Tamlin M. Pavelsky, Qianqian Han, Fangfang Yao
Abstract
Accurately monitoring forest canopy height is crucial for sustainable forest management, particularly in southeastern North Carolina, USA, where dense forests and limited accessibility pose substantial challenges. This study presents an explainable machine learning framework that integrates sparse GEDI LiDAR samples with multi-sensor remote sensing data to improve both the accuracy and interpretability of forest canopy height estimation. The framework incorporates multitemporal optical observations from Sentinel-2; C-band backscatter and InSAR coherence from Sentinel-1; quad-polarization L-band backscatter and polarimetric decompositions from the Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR); texture features from National Agriculture Imagery Program (NAIP) aerial photography; and topographic data derived from an airborne LiDAR-based digital elevation model. We evaluated four machine learning algorithms: K-nearest neighbors (KNN), random forest (RF), support vector machine (SVM), and eXtreme gradient boosting (XGB). Accuracy was consistent across all four, with closely matched R² and RMSE values indicating the robustness of the method: KNN (R² = 0.496, RMSE = 5.13 m), RF (R² = 0.510, RMSE = 5.06 m), SVM (R² = 0.544, RMSE = 4.88 m), and XGB (R² = 0.548, RMSE = 4.85 m). Using the comprehensive feature set rather than subsets of it yielded better results, underscoring the value of multisource remotely sensed data. Crucially, SHapley Additive exPlanations (SHAP) revealed the multi-seasonal red-edge spectral bands of Sentinel-2 as dominant predictors across models, while volume scattering from UAVSAR emerged as a key driver in tree-based algorithms. These findings underscore the complementary nature of multi-sensor data and the interpretability of the models. By offering spatially continuous, high-quality canopy height estimates, this cost-effective, data-driven approach advances large-scale forest management and environmental monitoring, paving the way for improved decision-making and conservation strategies.
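For readers comparing the reported scores, the two accuracy metrics can be sketched as follows. This is a minimal illustration that assumes R² is the conventional coefficient of determination (1 - SS_res / SS_tot); the abstract does not state which definition was used. The function names and the toy height values are hypothetical and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the inputs (e.g. metres)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical toy canopy heights in metres (illustration only, not study data).
reference_heights = [10.0, 20.0, 30.0, 40.0]
predicted_heights = [12.0, 18.0, 33.0, 37.0]

print(f"RMSE = {rmse(reference_heights, predicted_heights):.2f} m")
print(f"R2   = {r2(reference_heights, predicted_heights):.3f}")
```

Under this definition, RMSE expresses the typical prediction error in metres, while R² expresses the share of variance in the reference heights that the predictions explain, so a model can improve on one metric more visibly than on the other.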