还原
粗集
计算机科学
特征选择
人工智能
模式识别(心理学)
启发式
预处理器
数据预处理
封面(代数)
特征(语言学)
决策表
集合(抽象数据类型)
数据挖掘
算法
工程类
语言学
程序设计语言
哲学
机械工程
作者
Nguyễn Ngọc Thủy,Sartra Wongthanavasu
标识
DOI:10.1109/tnnls.2020.3048080
摘要
Attribute reduction, also called feature selection, is one of the most important issues of rough set theory, which is regarded as a vital preprocessing step in pattern recognition, machine learning, and data mining. Nowadays, high-dimensional mixed and incomplete data sets are very common in real-world applications. Certainly, the selection of a promising feature subset from such data sets is a very interesting, but challenging problem. Almost all of the existing methods generated a cover on the space of objects to determine important features. However, some tolerance classes in the cover are useless for the computational process. Thus, this article introduces a new concept of stripped neighborhood covers to reduce unnecessary tolerance classes from the original cover. Based on the proposed stripped neighborhood cover, we define a new reduct in mixed and incomplete decision tables, and then design an efficient heuristic algorithm to find this reduct. For each loop in the main loop of the proposed algorithm, we use an error measure to select an optimal feature and put it into the selected feature subset. Besides, to deal more efficiently with high-dimensional data sets, we also determine redundant features after each loop and remove them from the candidate feature subset. For the purpose of verifying the performance of the proposed algorithm, we carry out experiments on data sets downloaded from public data sources to compare with existing state-of-the-art algorithms. Experimental results showed that our algorithm outperforms compared algorithms, especially in classification accuracy.
科研通智能强力驱动
Strongly Powered by AbleSci AI