多标签分类
人工智能
计算机科学
二元分类
二进制数
模式识别(心理学)
分类器(UML)
班级(哲学)
机器学习
不相关
特征选择
集合(抽象数据类型)
数学
支持向量机
统计
程序设计语言
算术
作者
Jun Huang,Guorong Li,Qingming Huang,Xindong Wu
摘要
Binary relevance (BR) is a well-known framework for multi-label classification. It decomposes multi-label classification into binary (one-vs-rest) classification subproblems, one for each label. The BR approach is a simple and straightforward way for multi-label classification, but it still has several drawbacks. First, it does not consider label correlations. Second, each binary classifier may suffer from the issue of class-imbalance. Third, it can become computationally unaffordable for data sets with many labels. Several remedies have been proposed to solve these problems by exploiting label correlations between labels and performing label space dimension reduction. Meanwhile, inconsistency, another potential drawback of BR, is often ignored by researchers when they construct multi-label classification models. Inconsistency refers to the phenomenon that if an example belongs to more than one class label, then during the binary training stage, it can be considered as both positive and negative example simultaneously. This will mislead binary classifiers to learn suboptimal decision boundaries. In this paper, we seek to solve this problem by learning label specific features for each label. We assume that each label is only associated with a subset of features from the original feature set, and any two strongly correlated class labels can share more features with each other than two uncorrelated or weakly correlated ones. The proposed method can be applied as a feature selection method for multi-label learning and a general strategy to improve multi-label classification algorithms comprising a number of binary classifiers. Comparison with the state-of-the-art approaches manifests competitive performance of our proposed method.
科研通智能强力驱动
Strongly Powered by AbleSci AI