启发式
聚类分析
决策树
功能(生物学)
树(集合论)
计算机科学
简单(哲学)
多项式的
数学
时间复杂性
算法
组合数学
数据挖掘
数学优化
人工智能
数学分析
哲学
认识论
进化生物学
生物
作者
Eduardo Sany Laber,Lucas Murtinho,Felipe Leite de Oliveira
出处
期刊:Cornell University - arXiv
日期:2021-12-29
标识
DOI:10.48550/arxiv.2112.14718
摘要
A number of recent works have employed decision trees for the construction of explainable partitions that aim to minimize the $k$-means cost function. These works, however, largely ignore metrics related to the depths of the leaves in the resulting tree, which is perhaps surprising considering how the explainability of a decision tree depends on these depths. To fill this gap in the literature, we propose an efficient algorithm that takes into account these metrics. In experiments on 16 datasets, our algorithm yields better results than decision-tree clustering algorithms such as the ones presented in \cite{dasgupta2020explainable}, \cite{frost2020exkmc}, \cite{laber2021price} and \cite{DBLP:conf/icml/MakarychevS21}, typically achieving lower or equivalent costs with considerably shallower trees. We also show, through a simple adaptation of existing techniques, that the problem of building explainable partitions induced by binary trees for the $k$-means cost function does not admit an $(1+\epsilon)$-approximation in polynomial time unless $P=NP$, which justifies the quest for approximation algorithms and/or heuristics.
科研通智能强力驱动
Strongly Powered by AbleSci AI