Ultra-DPC: Ultra-scalable and Index-Free Density Peak Clustering

Computer science; Scalability; Cluster analysis; Index (typography); Artificial intelligence; Database; World Wide Web
Authors
Luyao Ma,Geping Yang,Xiang Chen,Yiyang Yang,Zhiguo Gong,Zhifeng Hao
Source
Journal: Lecture Notes in Computer Science; Pages: 139-154; Cited by: 1
Identifier
DOI:10.1007/978-981-97-2421-5_10
Abstract

Density-based clustering is a fundamental and effective tool for recognizing connectivity structure. The density peak, the data object with the maximum density within a predefined sphere, plays a critical role. However, Density Peak Estimation (DPE), the process of identifying the nearest denser neighbor of each data object, is extremely expensive. State-of-the-art acceleration methods that rely on an index are still resource-intensive for large-scale data. In this work, we propose Ultra-DPC, an ultra-scalable and index-free Density Peak Clustering method for Euclidean space, to address these challenges. We theoretically study the correlation between two seemingly different clustering algorithms, p-means and density-based clustering, and provide a novel p-means density estimator. Based on this, first, p-means is applied to a set of samples S to find a set of p Local Density Peaks (LDP), where $p \ll N$ and N is the number of data objects. Second, an informative LDP-wise affinity graph is constructed and then enriched by a Random Walk process to incorporate clues from the non-LDP objects. Third, the importance of each LDP is estimated and the most important ones are chosen as seeds. Finally, the class memberships of the remaining objects are determined according to their relations to the LDP. Ultra-DPC is the fastest DPE method and does not reduce clustering quality. Evaluation on medium- and large-scale datasets demonstrates both the efficiency and effectiveness of Ultra-DPC over state-of-the-art density-based methods.
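
The four steps in the abstract can be illustrated with a rough Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: "p-means" is taken to be Lloyd-style centroid clustering with p centroids, the density of an LDP is approximated by the number of sample points it owns, and importance is scored with the classic density-peak heuristic (density times distance to the nearest denser LDP). The Random Walk enrichment of the affinity graph is omitted because the abstract does not specify it. The names `sq_dists`, `lloyd`, and `ldp_clustering_sketch` are hypothetical.

```python
import numpy as np

def sq_dists(A, B):
    """Squared Euclidean distances between the rows of A and the rows of B."""
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)

def lloyd(S, p, n_iter=25, seed=0):
    """Plain Lloyd iterations; returns p centroids and a label for each row of S."""
    rng = np.random.default_rng(seed)
    centers = S[rng.choice(len(S), size=p, replace=False)].copy()
    for _ in range(n_iter):
        labels = sq_dists(S, centers).argmin(axis=1)
        for j in range(p):
            if np.any(labels == j):  # leave an empty centroid where it is
                centers[j] = S[labels == j].mean(axis=0)
    return centers, sq_dists(S, centers).argmin(axis=1)

def ldp_clustering_sketch(X, p, n_clusters, sample_size, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: centroid clustering on a sample S yields p candidate LDPs.
    S = X[rng.choice(len(X), size=min(sample_size, len(X)), replace=False)]
    ldp, s_labels = lloyd(S, p, seed=seed)
    # Assumed density proxy: number of sample points owned by each LDP.
    rho = np.bincount(s_labels, minlength=p).astype(float)
    # Step 2 (simplified): nearest denser LDP for every LDP, no random walk.
    order = np.lexsort((np.arange(p), -rho))  # densest first, ties by index
    d = np.sqrt(sq_dists(ldp, ldp))
    parent = np.full(p, -1)
    delta = np.full(p, d.max())
    for rank, i in enumerate(order[1:], start=1):
        denser = order[:rank]
        j = denser[d[i, denser].argmin()]
        parent[i], delta[i] = j, d[i, j]
    # Step 3: assumed importance score; the top n_clusters LDPs become seeds.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest LDP has no denser neighbor
    seeds = np.argsort(-gamma)[:n_clusters]
    ldp_label = np.full(p, -1)
    ldp_label[seeds] = np.arange(n_clusters)
    for i in order:  # densest first, so a parent is labelled before its child
        if ldp_label[i] < 0:
            ldp_label[i] = ldp_label[parent[i]]
    # Step 4: every object inherits the label of its nearest LDP.
    return ldp_label[sq_dists(X, ldp).argmin(axis=1)]
```

Under these assumptions the nearest-denser search runs over the p LDPs rather than all N objects, which is the source of the savings the abstract describes; the final assignment still compares each object against p centroids.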