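Cluster analysis
Pattern recognition (psychology)
Feature (linguistics)
Feature selection
Artificial intelligence
Computer science
Similarity (geometry)
Minimum redundancy feature selection
FLAME clustering
Correlation clustering
Single-linkage clustering
Data mining
Feature vector
CURE data clustering algorithm
Image (mathematics)
Philosophy
Linguistics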
Authors
Xiaoyan Zhu,Yu Wang,Yingbin Li,Yonghui Tan,Guangtao Wang,Qinbao Song
Abstract
Unsupervised feature selection is an important problem, especially for high-dimensional data. However, it has so far been scarcely studied, and existing algorithms do not provide satisfactory performance. In this paper, we therefore propose a new unsupervised feature selection algorithm using similarity-based feature clustering, Feature Selection-based Feature Clustering (FSFC). FSFC removes redundant features according to the results of feature clustering based on feature similarity. First, it clusters the features according to their similarity, using a newly proposed feature clustering algorithm that overcomes the shortcomings of K-means. Second, it selects from each cluster a representative feature, one that carries most of the interesting information of the features in that cluster. The efficiency and effectiveness of FSFC are tested on real-world data sets and compared with two representative unsupervised feature selection algorithms, Feature Selection Using Similarity (FSUS) and Multi-Cluster-based Feature Selection (MCFS), in terms of runtime, feature compression ratio, and the clustering results of K-means. The results show that FSFC not only reduces the feature space in less time but also significantly improves the clustering performance of K-means.
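The two-step idea described in the abstract (cluster features by similarity, then keep one representative per cluster) can be sketched as follows. This is only an illustrative sketch and not the FSFC algorithm itself: the abstract does not specify the similarity measure, the clustering procedure, or the rule for choosing a representative, so the absolute Pearson correlation, the greedy threshold-based grouping, the `threshold` parameter, and the function name `select_features_by_clustering` are all assumptions made for this example.

```python
import numpy as np


def select_features_by_clustering(X, threshold=0.8):
    """Illustrative similarity-based feature clustering and selection.

    NOT the authors' FSFC algorithm: similarity measure, clustering rule,
    and representative choice are assumptions made for this sketch.
    X is an (n_samples, n_features) array.
    """
    # Assumed similarity between features: absolute Pearson correlation.
    sim = np.abs(np.corrcoef(X, rowvar=False))
    # Constant features yield NaN correlations; treat them as dissimilar.
    sim = np.nan_to_num(sim)
    np.fill_diagonal(sim, 1.0)

    # Step 1 (assumed greedy rule): take the lowest-index unassigned feature
    # as a seed and group every unassigned feature similar enough to it.
    unassigned = set(range(sim.shape[0]))
    clusters = []
    while unassigned:
        seed = min(unassigned)
        members = [j for j in sorted(unassigned) if sim[seed, j] >= threshold]
        clusters.append(members)
        unassigned -= set(members)

    # Step 2 (assumed rule): the representative of a cluster is the member
    # with the highest mean similarity to the members of that cluster.
    representatives = []
    for members in clusters:
        sub = sim[np.ix_(members, members)]
        representatives.append(members[int(np.argmax(sub.mean(axis=1)))])

    return sorted(representatives), clusters
```

Calling `select_features_by_clustering(X)` returns the indices of the retained features together with the feature clusters, so `X[:, representatives]` gives the reduced data that could then be passed to a clustering method such as K-means. Unlike K-means, this greedy rule does not require the number of clusters in advance, but its result depends on the chosen threshold and on feature order.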