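DBSCAN
Computer science
Cluster analysis
Function (biology)
Data mining
Computer security
Artificial intelligence
Fuzzy clustering
CURE data clustering algorithm
Evolutionary biology
Biology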
Authors
Jiaxuan Fu, Ke Cheng, Anxiao Song, Yuheng Xia, Zhao Chang, Yulong Shen
Identifier
DOI:10.1109/tifs.2024.3446233
Abstract
Density-based clustering algorithms, such as DBSCAN, are highly effective in handling large datasets and identifying clusters of arbitrary shapes, playing a crucial role in data analysis fields like outlier detection and social networks. Outsourcing DBSCAN to the cloud brings substantial benefits but also raises major privacy concerns regarding the private input data of data owners. Existing private DBSCAN methods often face challenges of inefficiency or potential privacy leakage, hindering their practical deployment. To address these challenges, we introduce FSS-DBSCAN, a three-server MPC platform designed for outsourced private density-based clustering using function secret sharing (FSS). This solution guarantees clustering quality equivalent to plaintext algorithms, ensures comprehensive privacy protection, and achieves top-tier efficiency. The high performance of FSS-DBSCAN is driven by two pivotal strategies. First, we devise an MPC-friendly DBSCAN algorithm that is highly compatible with efficient secret-sharing-based cryptographic protocols and benefits from GPU acceleration. Second, we construct novel FSS-based protocols tailored for complex operations integral to our DBSCAN variant, such as Euclidean distance comparison and point assignment, and further optimize their computation through tensorization techniques. We implement our platform as an extensible system on top of PyTorch that leverages GPU hardware acceleration for cryptographic and tensorized operations. These innovations enable FSS-DBSCAN to significantly outperform ppDBSCAN (AsiaCCS 2021), reducing the clustering time for 5000 samples to approximately 2 hours, achieving an $83.4\times$ speed improvement.
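To make the secret-sharing idea in the abstract concrete, the following is a minimal sketch of plain additive secret sharing among three parties. It is not the FSS-DBSCAN protocol and not function secret sharing itself. The ring size `MOD`, the helper names `share`, `reconstruct`, and `sub_shares`, and the sample points are all assumptions chosen for illustration. The sketch only shows that the linear step of a distance computation (the coordinate difference) can be carried out locally on shares; the reconstruction of the difference at the end is done purely to check the result and would leak information in a real protocol.

```python
import secrets

MOD = 2 ** 64  # assumed ring Z_{2^64}; the paper's actual ring is not stated here


def share(value, n_parties=3):
    """Split an integer into n additive shares that sum to value mod MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MOD
    return shares + [last]


def reconstruct(shares):
    """Sum the shares mod MOD and map the result back to a signed integer."""
    total = sum(shares) % MOD
    return total - MOD if total >= MOD // 2 else total


def sub_shares(a, b):
    """Each party subtracts its own shares locally; no communication needed."""
    return [(x - y) % MOD for x, y in zip(a, b)]


# Two hypothetical 2-D points with integer coordinates.
p = (3, 7)
q = (6, 3)

# Share every coordinate among three parties.
p_sh = [share(c) for c in p]
q_sh = [share(c) for c in q]

# Shares of the coordinate-wise difference p - q, computed locally.
diff_sh = [sub_shares(a, b) for a, b in zip(p_sh, q_sh)]

# Illustration only: opening the difference reveals it to everyone.
diff = [reconstruct(s) for s in diff_sh]
sq_dist = sum(d * d for d in diff)
print(diff, sq_dist)  # [-3, 4] 25
```

The squaring and the comparison against a radius threshold are the non-linear steps; according to the abstract, those are what the FSS-based protocols are designed to handle without opening intermediate values.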