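Mean squared error
Random forest
Environmental science
Predictive modelling
Regression analysis
Statistics
Regression
Scale (ratio)
Machine learning
Meteorology
Atmospheric sciences
Mathematics
Computer science
Geography
Cartography
Geology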
Authors
Gongbo Chen,Shanshan Li,Luke D. Knibbs,Nicholas Hamm,Wei Cao,Tiantian Li,Jianping Guo,Hongyan Ren,Michael J. Abramson,Yuming Guo
Identifier
DOI:10.1016/j.scitotenv.2018.04.251
Abstract
Background: Machine learning algorithms have very high predictive ability. However, no study has used machine learning to estimate historical concentrations of PM2.5 (particulate matter with aerodynamic diameter ≤ 2.5 μm) at daily time scale in China at a national level.
Objectives: To estimate daily concentrations of PM2.5 across China during 2005–2016.
Methods: Daily ground-level PM2.5 data were obtained from 1479 stations across China during 2014–2016. Data on aerosol optical depth (AOD), meteorological conditions and other predictors were downloaded. A random forests model (non-parametric machine learning algorithms) and two traditional regression models were developed to estimate ground-level PM2.5 concentrations. The best-fit model was then utilized to estimate the daily concentrations of PM2.5 across China with a resolution of 0.1° (≈10 km) during 2005–2016.
Results: The daily random forests model showed much higher predictive accuracy than the other two traditional regression models, explaining the majority of spatial variability in daily PM2.5 [10-fold cross-validation (CV) R² = 83%, root mean squared prediction error (RMSE) = 28.1 μg/m³]. At the monthly and annual time-scale, the explained variability of average PM2.5 increased up to 86% (RMSE = 10.7 μg/m³ and 6.9 μg/m³, respectively).
Conclusions: Taking advantage of a novel application of modeling framework and the most recent ground-level PM2.5 observations, the machine learning method showed higher predictive ability than previous studies.
Capsule: Random forests approach can be used to estimate historical exposure to PM2.5 in China with high accuracy.
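The abstract reports 10-fold cross-validated R² and RMSE. As a rough illustration of how such figures can be computed, the sketch below runs 10-fold CV on synthetic data using only the Python standard library. It is not the authors' code: a one-predictor least-squares line stands in for the random forests model, the `aod` and `pm25` values are made up, and the function names `fit_line` and `k_fold_cv` are hypothetical. R² is taken here as 1 − SS_res/SS_tot on the pooled out-of-fold predictions, which may differ from the definition used in the paper.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares line y = a + b*x (a stand-in for the real model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def k_fold_cv(xs, ys, k=10, seed=0):
    """Return (R2, RMSE) computed from pooled out-of-fold predictions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    # Split the shuffled indices into k roughly equal folds.
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the other k-1 folds, predict the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = a + b * xs[i]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(ys))
    return r2, rmse


# Synthetic data: a made-up AOD-like predictor and PM2.5-like response.
rng = random.Random(42)
aod = [rng.uniform(0.1, 1.5) for _ in range(500)]
pm25 = [20.0 + 60.0 * x + rng.gauss(0.0, 10.0) for x in aod]

r2, rmse = k_fold_cv(aod, pm25, k=10)
print(f"CV R2 = {r2:.2f}, RMSE = {rmse:.1f}")
```

Because every observation is predicted by a model that never saw it during fitting, the pooled R² and RMSE estimate out-of-sample accuracy rather than goodness of fit on the training data.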