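Overfitting
Akaike information criterion
Computer science
Calibration
Feature (linguistics)
Air quality index
Random forest
Feature selection
Nonlinear system
Selection (genetic algorithm)
Artificial intelligence
Machine learning
Data mining
Mathematics
Statistics
Geography
Physics
Quantum mechanics
Artificial neural network
Philosophy
Meteorology
Linguistics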
Authors
Yuxiang Lin, Wei Dong, Yuan Chen
Abstract
Urban air quality information, e.g., PM2.5 concentration, is of great importance to both the government and society. Recently, there has been growing interest in developing low-cost sensors, installed on moving vehicles, for fine-grained air quality measurement. However, low-cost mobile sensors typically suffer from low accuracy and thus need careful calibration to maintain high measurement quality. In this paper, we propose a two-phase data calibration method consisting of a linear part and a nonlinear part. We use MLS (multiple least squares) to train the linear part, and RF (random forest) to train the nonlinear part. We propose an automatic feature selection algorithm based on AIC (Akaike information criterion) for the linear model, which helps avoid overfitting caused by the inclusion of inappropriate features. We evaluate our method extensively. Results show that our method outperforms existing approaches, achieving an overall accuracy improvement of 16.4% in terms of PM2.5 levels compared with the state-of-the-art approach.
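The abstract does not spell out the AIC-based selection procedure, so the following is only a minimal sketch of one common way to do it: greedy forward selection over a least-squares model, scored with the Gaussian AIC, n ln(RSS/n) + 2k. The function names `gaussian_aic` and `forward_select_aic`, the forward-selection strategy, and the synthetic data are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def gaussian_aic(X, y, features):
    """AIC (up to an additive constant) of a least-squares fit on `features`."""
    n = len(y)
    # Design matrix: an intercept column plus the chosen feature columns.
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in features])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    k = A.shape[1]  # number of fitted coefficients, intercept included
    return n * np.log(rss / n) + 2 * k


def forward_select_aic(X, y):
    """Greedily add the feature that lowers AIC most; stop when none does."""
    selected = []
    remaining = list(range(X.shape[1]))
    best = gaussian_aic(X, y, selected)  # intercept-only baseline
    while remaining:
        score, j = min((gaussian_aic(X, y, selected + [j]), j) for j in remaining)
        if score >= best:
            break  # no candidate improves AIC, so stop adding features
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


# Synthetic example (hypothetical data): only columns 0 and 2 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(scale=0.5, size=200)
selected, aic = forward_select_aic(X, y)
print(sorted(selected), round(aic, 1))
```

The 2k penalty is what discourages adding features that reduce the residual only marginally, which is the overfitting safeguard the abstract refers to. The sketch covers the linear phase only; the random forest phase is not shown.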