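Cluster analysis
Scale (ratio)
Computer science
Spectral clustering
Data mining
Artificial intelligence
Pattern recognition (psychology)
Geography
Cartography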
Authors
Wen Zhang,Jiangpeng Zhao,Lean Yu,Song Wang
Identifier
DOI:10.1109/tnnls.2025.3571473
Abstract
Anchor-based methods improve the scalability of traditional spectral clustering (SC) by using anchors to construct the affinity matrix of objects. However, existing anchor-based algorithms commonly neglect the membership heterogeneity of objects within a cluster, which lowers anchor quality and hurts clustering accuracy. To address this problem, this article proposes a novel approach that uses the Gaussian mixture model (GMM) to enhance anchor-based SC for large-scale data in a two-stage divide-and-conquer manner. In the first stage, a GMM fitted with the expectation-maximization (EM) algorithm divides the objects into two categories, prior-consistent objects and prior-uncertain objects, taking the membership heterogeneity of objects into account. In the second stage, anchor-based SC is conducted on the prior-uncertain objects, with anchors sampled from the Gaussian components derived in the first stage. The clusters produced in the second stage are then aligned with those Gaussian components by maximizing the membership of objects with respect to clusters. The computational complexity of the proposed GMM-SC approach is much lower than that of anchor-based SC. Experiments on large-scale datasets also validate the superiority of the proposed GMM-SC approach over state-of-the-art techniques.
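The first stage described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how objects are assigned to the two categories, so the confidence threshold `tau`, the 1-D synthetic data, and all function names below are assumptions made purely for illustration. The sketch fits a GMM with plain EM and marks an object as prior-consistent when its largest posterior membership is at least `tau`, and as prior-uncertain otherwise.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """E-step for one point: posterior membership in each Gaussian component."""
    joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


def fit_gmm_1d(data, k, n_iter=50, min_var=1e-6):
    """Fit a k-component 1-D GMM with plain EM (illustrative, not optimized)."""
    ordered = sorted(data)
    # Deterministic initialization: means at evenly spaced quantiles.
    means = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior memberships for every object.
        resp = [responsibilities(x, weights, means, variances) for x in data]
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)
    return weights, means, variances


def split_by_confidence(data, params, tau=0.95):
    """Assumed split rule: max posterior >= tau means prior-consistent."""
    consistent, uncertain = [], []
    for i, x in enumerate(data):
        r = responsibilities(x, *params)
        (consistent if max(r) >= tau else uncertain).append(i)
    return consistent, uncertain


# Synthetic data: two overlapping 1-D clusters (an assumption for the demo).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(4.0, 1.0) for _ in range(300)]
params = fit_gmm_1d(data, k=2)
consistent, uncertain = split_by_confidence(data, params, tau=0.95)
print(len(consistent), len(uncertain))
```

In this sketch, only the indices in `uncertain` would be handed to the second stage, where the abstract says anchor-based SC is run with anchors sampled from the fitted Gaussian components. Raising `tau` sends more objects to that stage; lowering it trusts the GMM assignment for more of them.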