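Cluster analysis
Bivariate analysis
Random forest
Data mining
Naive Bayes classifier
Spatial analysis
Computer science
Apportionment
Environmental science
Machine learning
Statistics
Mathematics
Support vector machine
Political science
Law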
Authors
Guoxin Huang, Xiahui Wang, Di Chen, Yipeng Wang, Shizheng Zhu, Tao Zhang, Lei Liao, Zi Tian, Nan Wei
Identifier
DOI:10.1016/j.jhazmat.2022.129324
Abstract
The efficacy of source apportionment is often limited by a lack of information on the natural and anthropogenic factors that contribute to soil heavy metal (HM) contamination. To overcome this limitation and advance data mining methods, a novel hybrid data-driven framework was proposed to diagnose the contributing factors in an industrialized region of Guangdong Province, China. The framework mainly combines naive Bayes (NB), random forest (RF), and bivariate local Moran's I (BLMI), applied to multi-source big data. The medium industry types of enterprises in the freely available Baidu point of interest data were successfully classified, and 250 contaminating enterprises were then identified as a contributing factor by the optimized NB classifier. The quantitative contributions of nine contributing factors to the As, Cd, and Hg concentrations were determined by the optimized RF. Twelve spatial clustering maps between the three HM concentrations and the four key contributing factors were generated by BLMI, explicitly revealing their mutual interactions and internal effects and intuitively showing the "high-high" areas and their distributions. The framework yields rich information on contributing factors, including medium industry types, contribution rates, spatial clusters, and spatial distributions.
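
The abstract names bivariate local Moran's I (BLMI) as the statistic behind the spatial clustering maps but does not give its formula or settings. As a rough illustration only, the sketch below uses the commonly cited form, I_i = z_x(i) * sum_j w_ij * z_y(j), with row-standardized weights, and labels each site by quadrant ("high-high" and so on). The four sites, their values, the adjacency matrix, and all function names are hypothetical and are not taken from the paper; the permutation-based significance test that normally accompanies BLMI is omitted.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores using the population standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def row_standardize(weights):
    """Scale each row of a spatial weights matrix so it sums to 1."""
    result = []
    for row in weights:
        total = sum(row)
        result.append([w / total if total else 0.0 for w in row])
    return result


def bivariate_local_moran(x, y, weights):
    """Compute I_i = z_x(i) * sum_j w_ij * z_y(j) and a quadrant label per site.

    x: variable observed at each site (e.g. an HM concentration)
    y: second variable whose neighbourhood average is compared with x
    weights: square matrix, weights[i][j] > 0 if site j neighbours site i
    """
    zx = standardize(x)
    zy = standardize(y)
    w = row_standardize(weights)
    results = []
    for i in range(len(x)):
        # Spatial lag: weighted average of the standardized y at the neighbours.
        lag_y = sum(w[i][j] * zy[j] for j in range(len(y)))
        stat = zx[i] * lag_y
        if zx[i] > 0 and lag_y > 0:
            label = "high-high"
        elif zx[i] < 0 and lag_y < 0:
            label = "low-low"
        elif zx[i] > 0 and lag_y < 0:
            label = "high-low"
        elif zx[i] < 0 and lag_y > 0:
            label = "low-high"
        else:
            label = "undefined"  # a z-score or the lag is exactly zero
        results.append((stat, label))
    return results


# Hypothetical data: four sites along a line, each adjacent to the next.
hm_concentration = [1.0, 2.0, 8.0, 9.0]   # made-up HM values
factor_intensity = [0.0, 1.0, 5.0, 6.0]   # made-up contributing-factor values
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

blmi = bivariate_local_moran(hm_concentration, factor_intensity, adjacency)
for site, (stat, label) in enumerate(blmi):
    print(site, round(stat, 3), label)
```

In this toy layout the two sites with high values of both variables fall in the "high-high" quadrant. A real analysis would derive the weights from site coordinates and test each local statistic for significance before mapping it.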