Journal: IEEE Systems, Man, and Cybernetics Magazine [Institute of Electrical and Electronics Engineers] Date: 2025-01-01 Volume (Issue): 11 (1): 23-33
Identifier
DOI:10.1109/msmc.2024.3443535
Abstract
The k-means model and algorithms to optimize it are ubiquitous in cluster analysis. It is impossible to overstate the popularity of this method, which is by far the most heavily cited and studied approach to hard (i.e., non-soft) clustering on earth. This limited tutorial dispels some popularly held misconceptions about this basic method. It begins with a short history of the two algorithms commonly called k-means. Then the geometric structure of hard and soft partition sets underlying all hard, probabilistic, and fuzzy clustering algorithms is presented. The structural theory illuminates some little-known facts about the foundations of k-means and some of its soft relatives. Finally, two soft (probabilistic and fuzzy) generalizations of k-means that should be of interest to practitioners in this area are briefly discussed.