Random forest
Feature selection
Computer science
CHAID
Support vector machine
Feature (linguistics)
Word error rate
Artificial intelligence
Pattern recognition (psychology)
Selection (genetic algorithm)
Data mining
Dimensionality (graph theory)
Machine learning
Decision tree
Mathematics
Linguistics
Philosophy
Pure mathematics
Authors
Zhuo Wang, Huan Li, Bin Nie, Jianqiang Du, Yuwen Du, Yufeng Chen
Identifier
DOI:10.1109/icceai52939.2021.00062
Abstract
Aiming at the curse of dimensionality and over-fitting problems in data analysis, this paper proposes a feature selection method based on a hybrid integration of different models and random forests (Integrate-RF). First, Integrate-RF uses CART, CHAID, SVM, BN, NN, K-Means, and Kohonen models to evaluate feature importance, and then takes the arithmetic average of these seven rankings to obtain an overall importance score for each feature. Second, Integrate-RF repeatedly moves the most important of the remaining features into the candidate feature subset and applies random forest classification to obtain the corresponding out-of-bag (OOB) classification error rate. Finally, the optimal feature subset is selected according to the OOB classification error rate. Experiments show that the proposed feature selection method effectively reduces the data dimensionality and selects features that are better and more adaptable.
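The selection loop described in the abstract lends itself to a short illustration. The following is a minimal sketch, not the authors' implementation: it assumes scikit-learn, uses a CART decision tree and a linear SVM as stand-ins for the paper's seven base models (CHAID, BN, NN, K-Means, and Kohonen are not reproduced), averages their normalized importance scores, and then grows the feature subset in importance order while tracking the random forest OOB error rate.

# Minimal sketch of the Integrate-RF workflow (assumed, not the authors' code):
# stand-in importance models + arithmetic averaging + OOB-guided forward selection.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)  # placeholder dataset

# Step 1: per-model feature importance, each normalized to sum to 1.
cart_imp = DecisionTreeClassifier(random_state=0).fit(X, y).feature_importances_
svm = LinearSVC(max_iter=10000).fit(StandardScaler().fit_transform(X), y)
svm_imp = np.abs(svm.coef_).ravel()
svm_imp /= svm_imp.sum()

# Step 2: arithmetic average across the base models (the paper uses seven).
avg_imp = (cart_imp + svm_imp) / 2.0

# Step 3: add features in descending order of averaged importance and score
# each candidate subset by the random forest out-of-bag (OOB) error rate.
order = np.argsort(avg_imp)[::-1]
subset, best_err, best_subset = [], np.inf, None
for f in order:
    subset.append(f)
    rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
    rf.fit(X[:, subset], y)
    oob_err = 1.0 - rf.oob_score_
    if oob_err < best_err:
        best_err, best_subset = oob_err, list(subset)

# Step 4: the subset with the lowest OOB error is the selected feature set.
print("selected feature indices:", best_subset, "OOB error:", round(best_err, 4))

Growing the subset greedily by averaged importance mirrors the procedure in the abstract; the stopping rule here simply keeps the subset with the lowest observed OOB error, which is one plausible reading of "selected based on the OOB data classification error rate".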