Ultra-DPC: Ultra-scalable and Index-Free Density Peak Clustering

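Computer science; Scalability; Cluster analysis; Index (typography); Artificial intelligence; Database; World Wide Web
Authors
Luyao Ma, Geping Yang, Xiang Chen, Yiyang Yang, Zhiguo Gong, Zhifeng Hao
Source
Journal: Lecture Notes in Computer Science. Volume/issue: (none listed), pp. 139-154. Cited by: 1
Identifier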
DOI:10.1007/978-981-97-2421-5_10
Abstract

Density-based clustering is a fundamental and effective tool for recognizing connectivity structure. The density peak, the data object with the maximum density within a predefined sphere, plays a critical role. However, Density Peak Estimation (DPE), the process of identifying the nearest denser relation for each data object, is extremely expensive. State-of-the-art acceleration techniques that rely on an index are still resource-consuming for large-scale data. In this work, we propose Ultra-DPC, an ultra-scalable and index-free Density Peak Clustering method for Euclidean space, to address these challenges. We theoretically study the correlation between two seemingly different clustering algorithms, p-means and density-based clustering, and provide a novel p-means density estimator. Based on this, first, p-means is applied to a set of samples S to find a set of p Local Density Peaks (LDP), where $p \ll N$ and N is the number of data objects. Second, an informative LDP-wise affinity graph is constructed and then enriched by a Random Walk process to incorporate the clues from the non-LDP objects. Third, the importance of each LDP is estimated and the most important ones are chosen as the seeds. Finally, the class memberships of the remaining objects are determined according to their relations to the LDP. Ultra-DPC is the fastest DPE method and achieves this without reducing the quality of clustering. Evaluation on different medium- and large-scale datasets demonstrates both the efficiency and effectiveness of Ultra-DPC over state-of-the-art density-based methods.
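To make the cost of DPE concrete, the following is a minimal brute-force sketch of the step the abstract calls "identifying the nearest denser relation for each data object". It is not the Ultra-DPC algorithm. The abstract does not say how density is measured, so the cutoff-radius count, the parameter `dc`, the function name `density_peak_estimation`, and the index-based tie-breaking rule are all assumptions made for illustration.

```python
import math

def density_peak_estimation(points, dc):
    """Brute-force DPE sketch (assumed cutoff-kernel density, radius dc)."""
    n = len(points)
    # rho[i]: number of other points strictly within distance dc of point i.
    rho = [
        sum(1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) < dc)
        for i in range(n)
    ]
    delta = [math.inf] * n  # distance to the nearest denser point
    parent = [-1] * n       # index of that point; -1 for the global peak
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            # Assumed tie-break: among equal densities, the lower index
            # counts as denser, so exactly one point ends up without a parent.
            denser = rho[j] > rho[i] or (rho[j] == rho[i] and j < i)
            if denser:
                d = math.dist(points[i], points[j])
                if d < delta[i]:
                    delta[i], parent[i] = d, j
    return rho, delta, parent

# Two small groups of 2-D points: four near the origin, two far away.
points = [(0, 0), (0, 1), (1, 0), (0.5, 0.5), (10, 10), (10, 11)]
rho, delta, parent = density_peak_estimation(points, dc=1.0)
```

Both loops touch every pair of objects, so the sketch needs on the order of N^2 distance evaluations. This is the cost the abstract describes as extremely expensive, and it is why restricting the computation to p Local Density Peaks with $p \ll N$ is attractive.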