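Cluster analysis
Spectral clustering
Computer science
Data stream clustering
CURE data clustering algorithm
Correlation clustering
Algorithm
Similarity (geometry)
Canopy clustering algorithm
Data mining
Fuzzy clustering
Single-linkage clustering
Division (mathematics)
Pattern recognition (psychology)
Data point
Coding (set theory)
Clustering high-dimensional data
Spectral space
Artificial intelligence
Matrix (chemical analysis)
Constrained clustering
Space (punctuation)
Algorithm design
Construct (python library)
Cluster (spacecraft)
Source code
Computational complexity theory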
Authors
Dongdong Cheng,Xiaocui Jiang,Shuyin Xia,Guoyin Wang,Jinlong Huang,Sulan Zhang,Yi Wang
Identifier
DOI:10.1109/tkde.2026.3672824
Abstract
Although spectral clustering can identify clusters of arbitrary shape, its high time and space complexity limits its use in large-scale data clustering. To tackle this problem, researchers have proposed using anchor points to construct the similarity matrix, thereby reducing time and space complexity. However, current methods for generating anchor points do not fit the data well and are limited in approach. To improve upon existing anchor-point generation methods, we propose a pseudo-label-based anchor-point generation approach and develop a fast spectral clustering algorithm for large-scale data, named FSC-PLGB. The algorithm first randomly selects r points as an initial granular-ball, applies K-Means to these points to obtain pseudo-labels, calculates the pseudo-purity of the granular-ball from these pseudo-labels, and then divides the granular-ball according to this pseudo-purity to generate anchor points. A similarity matrix is then constructed between all sample points and the anchor points, and finally spectral clustering is applied to obtain the clustering results. The experimental results demonstrate that the proposed algorithm exhibits high efficiency and significant superiority on large-scale datasets. The source code is available at https://github.com/DongdongCheng/FSC-PLGB.
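To make the anchor-based idea concrete, the following is a minimal sketch of generic anchor-graph spectral clustering in NumPy. It is not the FSC-PLGB implementation: the anchors here are a uniform random sample, an assumption standing in for the granular-ball anchor generation the abstract describes, and the pseudo-purity computation is not reproduced because the abstract does not define it. The function names, the Gaussian kernel, and the parameters `n_anchors` and `n_neighbors` are all hypothetical choices made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means with farthest-point initialisation."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        # Squared distance of every point to its closest chosen center.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def anchor_spectral_clustering(X, n_clusters, n_anchors=50, n_neighbors=5, seed=0):
    """Spectral clustering through an n x m sample-to-anchor similarity matrix."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # ASSUMPTION: random anchors replace the paper's granular-ball anchors.
    anchors = X[rng.choice(n, size=n_anchors, replace=False)]

    # n x m squared distances between samples and anchors (never n x n).
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)

    # Connect each sample only to its n_neighbors nearest anchors.
    nearest = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.arange(n)[:, None]
    sigma2 = d2[rows, nearest].mean() + 1e-12  # kernel bandwidth heuristic
    Z = np.zeros_like(d2)
    Z[rows, nearest] = np.exp(-d2[rows, nearest] / (2.0 * sigma2))
    Z /= Z.sum(axis=1, keepdims=True) + 1e-12  # each row sums to 1

    # Scale columns by anchor degree so that the implicit n x n affinity
    # W = Zh @ Zh.T has unit row sums; its leading eigenvectors are then
    # the leading left singular vectors of Zh.
    Zh = Z / np.sqrt(Z.sum(axis=0) + 1e-12)
    U, _, _ = np.linalg.svd(Zh, full_matrices=False)

    emb = U[:, :n_clusters]
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    return kmeans(emb, n_clusters, seed=seed)
```

With m anchors and n samples, the similarity matrix costs O(nm) memory instead of O(n^2), which is the saving the abstract refers to. This sketch stores Z densely for readability; at large scale one would likely keep Z sparse and obtain the embedding from the m x m matrix Zh.T @ Zh instead of a full SVD.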