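Cluster analysis
Pattern recognition (psychology)
Computer science
Balance (ability)
Data mining
Artificial intelligence
Medicine
Physical medicine and rehabilitation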
Authors
Qinghua Zhang,Yongyang Dai,Guoyin Wang
Identifier
DOI:10.1016/j.patcog.2022.109052
Abstract
• A novel density peaks clustering algorithm, BC-DPC, is proposed based on mutual nearest neighbors and connectivity; it can quickly find the correct cluster centers and obtain satisfactory clustering results.
• A new definition of density is proposed that eliminates the density difference among clusters and accurately estimates the density of the data points.
• The algorithm accounts for both density change and connectivity between data points when calculating the relative distance, which avoids the domino effect during clustering.
• A novel fast search strategy is proposed for calculating the relative distance of data points, which greatly improves the efficiency of the algorithm.
• The proposed algorithm is compared with DPC and five improved DPC variants on synthetic, UCI, and image datasets.
• The experimental results show the effectiveness and efficiency of the proposed algorithm.
The density peaks clustering (DPC) algorithm regards density peaks as potential cluster centers and assigns each non-center point to the cluster of its nearest higher-density neighbor. Although DPC can discover clusters of arbitrary shape, it has some limitations. On the one hand, its density measure fails to eliminate the density difference among clusters, which reduces the accuracy of cluster center recognition. On the other hand, the nearest higher-density point is determined without considering connectivity, so one wrong assignment leads to successive clustering errors. As a result, DPC fails to obtain satisfactory clustering results on datasets with large density differences among clusters. To overcome these limitations, a novel DPC algorithm based on balance density and connectivity (BC-DPC) is proposed. First, the balance density is introduced to eliminate the density difference among clusters so that cluster centers are recognized accurately. Second, connectivity between a data point and its nearest higher-density point is guaranteed by the mutual nearest neighbor relationship, which avoids successive clustering errors. Finally, a fast search strategy is proposed to find the nearest higher-density point. Experimental results on synthetic, UCI, and image datasets demonstrate the efficiency and effectiveness of the proposed algorithm.
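The baseline DPC procedure that the abstract describes (density peaks as centers, every other point joining the cluster of its nearest higher-density neighbor) can be illustrated with a minimal Python sketch. This is not BC-DPC: the abstract does not define the balance density, the mutual nearest neighbor test, or the fast search strategy, so none of them is reproduced here. The cutoff-count density and the selection of centers by the product of density and relative distance follow the commonly used formulation of the original DPC; the names `dpc`, `d_c`, and `n_clusters` and the toy data are assumptions made for illustration.

```python
import math


def dpc(points, d_c, n_clusters):
    """Baseline density peaks clustering (illustrative sketch, not BC-DPC)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from highest to lowest density (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n   # relative distance to the nearest higher-density point
    parent = [-1] * n   # index of that nearest higher-density point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no higher-density neighbor.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Cluster centers: points with the largest rho * delta.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_clusters]

    labels = [-1] * n
    for cluster_id, i in enumerate(centers):
        labels[i] = cluster_id

    # Each remaining point inherits the label of its nearest higher-density
    # point; visiting in density order guarantees the parent is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Toy data: two well-separated groups of five points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels, centers = dpc(points, d_c=1.0, n_clusters=2)
print(labels, centers)
```

The final assignment loop is where the limitation named in the abstract shows up: a point inherits its label from its nearest higher-density point without any connectivity check, so one wrong parent passes its label on to every point that depends on it.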