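Pattern recognition (psychology)
k-nearest neighbors algorithm
Feature selection
Artificial intelligence
Computer science
Selection (genetic algorithm)
Feature (linguistics)
Nearest-neighbor chain algorithm
Data mining
Rough set
Mathematics
Cluster analysis
Fuzzy clustering
Philosophy
Linguistics
Canopy clustering algorithm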
Authors
Weihua Xu, Yang Zhang, Yuhua Qian
Identifier
DOI:10.1109/tnnls.2024.3460796
Abstract
Large amounts of high-dimensional unlabeled data typically contain only a small portion of truly effective information, so unsupervised feature selection has gained significant research attention. However, current unsupervised feature selection approaches face limitations when dealing with datasets that exhibit uneven density, and they also require substantial computational time. To address these problems, this article proposes a feature selection technique that combines Fuzzy C-Means (FCM) and k-nearest neighbor rough sets. FCM is a clustering algorithm grounded in fuzzy theory that takes into account the inherent data structure and the correlations between different features, which makes it particularly well suited to datasets with uneven density. The proposed method consists of three steps. First, the FCM algorithm is used to cluster the unlabeled data. Second, a measure of feature importance is defined from the clustering results, and the features are sorted by it. Finally, redundant features are filtered out using k-nearest neighbor rough sets while important features are retained, significantly reducing the running time. In addition, we designed the feature selection algorithm (KND-UFS) and conducted experiments on 12 public datasets, comparing KND-UFS with eight existing algorithms in terms of running time, classification accuracy, and the number of selected features. The experimental results provide strong evidence of the superior performance of the KND-UFS algorithm.
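The three steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' KND-UFS implementation: the abstract does not give the importance measure or the rough-set criterion, so both are stand-ins chosen for illustration. The importance score here is an assumed between-cluster to within-cluster variance ratio, and the redundancy filter is an assumed greedy rule based on a k-nearest-neighbor dependency (the fraction of samples whose k nearest neighbors all share their cluster label). All function names and parameters (`feature_importance`, `knn_dependency`, `select_features`, `min_gain`) are hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard FCM. Returns (centers, U); U[i, j] is the membership of sample i in cluster j."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def feature_importance(X, labels):
    """Assumed score (not from the paper): between- over within-cluster variance per feature."""
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(labels):
        Xc = X[labels == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)


def knn_dependency(X, labels, features, k):
    """Assumed criterion: share of samples whose k nearest neighbors on `features` have the same label."""
    Z = X[:, features]
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]
    return np.mean((labels[nn] == labels[:, None]).all(axis=1))


def select_features(X, n_clusters=2, k=3, min_gain=1e-3):
    # Step 1: cluster the unlabeled data with FCM and take hard pseudo-labels.
    _, U = fuzzy_c_means(X, n_clusters)
    labels = U.argmax(axis=1)
    # Step 2: rank features by the (assumed) importance score, highest first.
    order = np.argsort(-feature_importance(X, labels))
    # Step 3: keep a feature only if it raises the k-NN dependency by more than min_gain.
    selected, best = [], 0.0
    for f in order:
        dep = knn_dependency(X, labels, selected + [int(f)], k)
        if dep - best > min_gain:
            selected.append(int(f))
            best = dep
    return selected
```

Calling `select_features(X)` on a samples-by-features NumPy array returns column indices in the order they were accepted. The pairwise distance matrices make this sketch quadratic in the number of samples, so it is only suitable for small illustrative inputs.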