
Some methods for classification and analysis of multivariate observations

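Mathematics · Generalization · Population · Nonparametric statistics · Independence (probability theory) · Multivariate statistics · Set (abstract data type) · Classification · Sample (material) · Partition (number theory) · Function (biology) · Algorithm · Statistics · Combinatorics · Computer science · Arithmetic · Mathematical analysis · Sociology · Demography · Biology · Evolutionary biology · Chemistry · Chromatography · Programming language
Author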
James B. MacQueen
Abstract

The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give partitions which are reasonably efficient in the sense of within-class variance. That is, if p is the probability mass function for the population, $S = \{S_1, S_2, \ldots, S_k\}$ is a partition of $E_N$, and $u_i$, $i = 1, 2, \ldots, k$, is the conditional mean of p over the set $S_i$, then $W^2(S) = \sum_{i=1}^{k} \int_{S_i} |z - u_i|^2 \, dp(z)$ tends to be low for the partitions S generated by the method. We say 'tends to be low,' primarily because of intuitive considerations, corroborated to some extent by mathematical analysis and practical computational experience. Also, the k-means procedure is easily programmed and is computationally economical, so that it is feasible to process very large samples on a digital computer. Possible applications include methods for similarity grouping, nonlinear prediction, approximating multivariate distributions, and nonparametric tests for independence among several variables. In addition to suggesting practical classification methods, the study of k-means has proved to be theoretically interesting. The k-means concept represents a generalization of the ordinary sample mean, and one is naturally led to study the pertinent asymptotic behavior, the object being to establish some sort of law of large numbers for the k-means. This problem is sufficiently interesting, in fact, for us to devote a good portion of this paper to it. The k-means are defined in section 2.1, and the main results which have been obtained on the asymptotic behavior are given there. The rest of section 2 is devoted to the proofs of these results. Section 3 describes several specific possible applications, and reports some preliminary results from computer experiments conducted to explore the possibilities inherent in the k-means idea. The extension to general metric spaces is indicated briefly in section 4.
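The abstract does not spell out the procedure itself, so the following is only a minimal sketch under stated assumptions: it uses the sequential update commonly attributed to this paper, in which the first k sample points seed the means and each later point is assigned to its nearest mean, which is then moved to the running average of its group. The function names (`sequential_k_means`, `within_class_variance`) are hypothetical, and the second function is an empirical stand-in for the within-class variance described above (average squared distance to the nearest mean), not the exact population quantity.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sequential_k_means(points, k):
    """One pass over the sample, updating k running means (assumed variant)."""
    if k < 1 or len(points) < k:
        raise ValueError("need at least k >= 1 sample points")
    # Assumption: the first k points serve as the initial means.
    means = [list(p) for p in points[:k]]
    counts = [1] * k
    for z in points[k:]:
        # Assign the new point to the nearest current mean.
        j = min(range(k), key=lambda i: sq_dist(z, means[i]))
        counts[j] += 1
        # Incremental update: new mean = old mean + (z - old mean) / count.
        means[j] = [m + (x - m) / counts[j] for m, x in zip(means[j], z)]
    return means


def within_class_variance(points, means):
    """Average squared distance from each point to its nearest mean."""
    return sum(min(sq_dist(z, m) for m in means) for z in points) / len(points)


if __name__ == "__main__":
    sample = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
    means = sequential_k_means(sample, k=2)
    print(means, within_class_variance(sample, means))
```

Because each point is visited once and only k means are stored, memory use does not grow with the sample size, which is consistent with the abstract's remark that the procedure is computationally economical for very large samples.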
The original point of departure for the work described here was a series of problems in optimal classification (MacQueen [9]) which represented special