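Cluster analysis
Computer science
Clustering high-dimensional data
Curse of dimensionality
CURE data clustering algorithm
Data mining
Correlation clustering
Data stream clustering
Canopy clustering algorithm
Exploit
Scheme (mathematics)
Fuzzy clustering
Artificial intelligence
Mathematics
Mathematical analysis
Computer security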
Authors
Songjie Xie, Youlong Wu, Kewen Liao, Lu Chen, Chengfei Liu, Haifeng Shen, MingJian Tang, Lu Sun
Identifier
DOI: 10.1109/icde55515.2023.00222
Abstract
Recent work has explored federated clustering and developed an efficient k-means-based method. However, it is well known that k-means clustering underperforms in high-dimensional space due to the so-called "curse of dimensionality". In addition, high-dimensional data (e.g., data generated in the healthcare, medical, and biological sectors) are pervasive in the big data era, which poses critical challenges to federated clustering in terms of, but not limited to, clustering effectiveness and communication efficiency. To fill this significant gap in federated clustering, we propose a one-shot federated subspace clustering scheme, Fed-SC, that can achieve remarkable clustering effectiveness on high-dimensional data while keeping communication cost low by using only one round of communication for each local device. We further establish theoretical guarantees on the clustering effectiveness of one-shot Fed-SC and exploit the benefits of statistical heterogeneity across distributed data. Extensive experiments on synthetic and real-world datasets demonstrate significant effectiveness gains of Fed-SC compared with both subspace clustering and one-shot federated clustering methods.