Authors
Jiani Yang, Sina Hasheminassab, Meredith Franklin, Antong Zhang, David J. Diner, Joseph Pinto, Yuk L. Yung
Abstract
Fine particulate matter (PM2.5, particulate matter with an aerodynamic diameter ≤2.5 μm) poses major public health and environmental risks, yet the toxicity of its chemical components remains poorly understood due to limited chemical speciation data. In this study we apply an extreme gradient boosting (XGBoost) machine learning framework to predict key PM2.5 components including organic carbon, elemental carbon, nitrate, sulfate, ammonium, and metals, using readily available predictors: total PM2.5 mass concentrations, meteorological variables, trace gas measurements, and indicators of exceptional events (e.g., wildfires, fireworks). Leveraging a decade of data from two monitoring sites in Southern California (Los Angeles and Rubidoux), the models achieved strong predictive performance, particularly for nitrate, ammonium, and elemental carbon. Among the most influential predictors across components were total PM2.5 mass, relative humidity, and boundary layer height. This approach has promise for enhancing satellite remote sensing applications, improving chemical transport model inputs, and generating cost-effective estimates of PM2.5 components during sampling gaps and in regions lacking frequent monitoring. Further research is needed to assess the generalizability of this framework across diverse geographic and climatic settings.

• Machine learning models accurately predict daily PM2.5 chemical components
• Nitrate, ammonium, and organic carbon show the highest predictive performance
• Relative humidity, PM2.5 mass, and NO2 are key predictors identified by SHAP
• The framework addresses data gaps in chemical speciation monitoring networks
• Results support satellite applications and cost-effective air quality assessment