列生成
栏(排版)
聚类分析
解释平方和
约束(计算机辅助设计)
算法
数学优化
数学
计算机科学
统计
几何学
连接(主束)
作者
Antonio M. Sudoso,Daniel Aloise
出处
期刊:Informs Journal on Computing
日期:2025-07-16
标识
DOI:10.1287/ijoc.2024.0938
摘要
The minimum sum-of-squares clustering problem (MSSC), also known as k-means clustering, refers to the problem of partitioning n data points into k clusters, with the objective of minimizing the total sum of squared Euclidean distances between each point and the center of its assigned cluster. We propose an efficient algorithm for solving large-scale MSSC instances, which combines column generation (CG) with dynamic constraint aggregation (DCA) to effectively reduce the number of constraints considered in the CG master problem. DCA was originally conceived to reduce degeneracy in set partitioning problems by utilizing an aggregated restricted master problem obtained from a partition of the set partitioning constraints into disjoint clusters. In this work, we explore the use of DCA within a CG algorithm for MSSC exact solution. Our method is fine-tuned by a series of ablation studies on DCA design choices, and is demonstrated to significantly outperform existing state-of-the-art exact approaches available in the literature. History: Accepted by Andrea Lodi, Design & Analysis of Algorithms–Discrete. Funding: This work was supported by Natural Sciences and Engineering Research Council of Canada [Grant 2023-04466]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0938 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2024.0938 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
科研通智能强力驱动
Strongly Powered by AbleSci AI