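Mutual information
Feature selection
Computer science
Artificial intelligence
Correlation
Feature (linguistics)
Redundancy (engineering)
Machine learning
Pattern recognition (psychology)
Relevance (law)
Data mining
Mathematics
Philosophy
Law
Geometry
Operating system
Linguistics
Political science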
Authors
Yaojin Lin,Qinghua Hu,Jinghua Liu,Jinjin Li,Xindong Wu
Source
Journal: IEEE Transactions on Fuzzy Systems
[Institute of Electrical and Electronics Engineers]
Date: 2017-12-01
Volume (issue): 25 (6): 1491-1507
Citations: 141
Identifier
DOI:10.1109/tfuzz.2017.2735947
Abstract
Due to complex semantics, a sample may be associated with multiple labels in various classification and recognition tasks. Multilabel learning trains models that map feature vectors to multiple labels. There are several significant challenges in multilabel learning. Samples are usually described with high-dimensional features, and some features may be extracted sequentially. Thus, the full feature set may not be known at the beginning of learning; features that arrive in this way are referred to as streaming features. In this paper, we introduce fuzzy mutual information to evaluate the quality of features in multilabel learning, and design efficient algorithms to conduct multilabel feature selection when the feature space is completely known or partially known in advance. These algorithms are called multilabel feature selection with label correlation (MUCO) and multilabel streaming feature selection (MSFS), respectively. MSFS consists of two key steps: online relevance analysis and online redundancy analysis. In addition, we design a metric to measure the correlation between the label sets, and both MUCO and MSFS take label correlation into consideration. The proposed algorithms are not only able to select features from streaming features, but also able to select features for ordinal multilabel learning. However, streaming feature selection is more efficient. The proposed algorithms are tested on a collection of multilabel learning tasks. The experimental results illustrate the effectiveness of the proposed algorithms.
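The abstract names the two online steps of MSFS (relevance analysis, then redundancy analysis) but gives no formulas. As a rough illustration only, the Python sketch below shows how such a streaming loop could look. It uses one commonly cited definition of fuzzy mutual information over fuzzy similarity relations. The similarity function, the `min_relevance` threshold, the averaging over labels, and the redundancy rule are all assumptions made for this sketch rather than details taken from the paper, and label correlation is not modeled.

```python
import math

def fuzzy_relation(values, width=1.0):
    # Assumed similarity: r_ij = max(0, 1 - |v_i - v_j| / width).
    return [[max(0.0, 1.0 - abs(a - b) / width) for b in values] for a in values]

def fuzzy_entropy(rel):
    # H(R) = -(1/n) * sum_i log2(|[x_i]_R| / n), where |[x_i]_R| is the row sum.
    n = len(rel)
    return -sum(math.log2(sum(row) / n) for row in rel) / n

def fuzzy_mi(rel_a, rel_b):
    # I(A; B) = H(A) + H(B) - H(A, B); the joint relation is the elementwise min.
    joint = [[min(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(rel_a, rel_b)]
    return fuzzy_entropy(rel_a) + fuzzy_entropy(rel_b) - fuzzy_entropy(joint)

def select_streaming(feature_stream, labels, min_relevance=0.1):
    """Process features one at a time (hypothetical rules, not the paper's MSFS)."""
    label_rels = [fuzzy_relation(column) for column in labels]
    selected = {}  # feature name -> (fuzzy relation, relevance score)
    for name, values in feature_stream:
        rel = fuzzy_relation(values)
        # Online relevance analysis: mean fuzzy MI with each label column.
        relevance = sum(fuzzy_mi(rel, lr) for lr in label_rels) / len(label_rels)
        if relevance < min_relevance:
            continue
        # Online redundancy analysis: drop the candidate if a kept feature that is
        # at least as relevant shares at least that much information with it.
        redundant = any(
            kept_relevance >= relevance and fuzzy_mi(rel, kept_rel) >= relevance
            for kept_rel, kept_relevance in selected.values()
        )
        if not redundant:
            selected[name] = (rel, relevance)
    return list(selected)

# Toy data: four samples, two binary labels, four features arriving in order.
labels = [[0, 0, 1, 1], [0, 1, 0, 1]]
stream = [
    ("a", [0, 0, 1, 1]),               # matches the first label
    ("a_copy", [0, 0, 1, 1]),          # duplicate of "a", so redundant
    ("noise", [0.5, 0.5, 0.5, 0.5]),   # constant, so irrelevant
    ("b", [0, 1, 0, 1]),               # matches the second label
]
print(select_streaming(stream, labels))  # -> ['a', 'b']
```

On this toy data the duplicate feature is rejected by the redundancy check and the constant feature by the relevance check, which mirrors the two-step structure described in the abstract.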