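Computer science
Feature selection
Pairwise comparison
Mutual information
Correlation
Redundancy (engineering)
Benchmark (surveying)
Data mining
Selection (genetic algorithm)
Feature (linguistics)
Artificial intelligence
Pattern recognition (psychology)
Relevance (law)
Machine learning
Information retrieval
Mathematics
Law
Geography
Philosophy
Geometry
Operating system
Linguistics
Political science
Geodesy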
Authors
Dianlong You, Yang Wang, Jiawei Xiao, Yaojin Lin, Maosheng Pan, Zhen Chen, Limin Shen, Xindong Wu
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2023-03-01
Volume (issue), pages: 35 (3): 2901-2915
Citations: 15
Identifier
DOI:10.1109/tkde.2021.3113514
Abstract
Multi-label streaming feature selection has attracted extensive attention in diverse big data applications. However, most existing work focuses on scenarios where labels are independent, ignoring real scenarios in which labels may be interdependent and correlated with each other. This paper aims to fill this gap by developing a novel online multi-label streaming feature selection scheme that takes label correlation into account, known as OMSFS_LC. In our design, we first calculate the correlation degree between labels to obtain the label weight. Then, we integrate the mutual information and the label weight to evaluate the correlation between features and labels. In particular, the scheme consists of three stages: 1) online significance analysis, which determines the significant features via the correlation degree between the newly arriving features and labels; 2) online relevance analysis, which obtains relevant features via the mutual information; and 3) online redundancy analysis, which filters out redundant features via pairwise comparison. We implement our solution and conduct extensive experiments on benchmark datasets for performance evaluation. The experimental results show that OMSFS_LC significantly outperforms state-of-the-art methods in terms of effectiveness and efficiency.
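The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes discrete features and labels, uses a hypothetical label weight (each label's share of the total pairwise mutual information among labels), collapses the significance and relevance stages into one threshold test, and uses a simple pairwise rule for redundancy. The names `label_weights`, `weighted_relevance`, `stream_select`, and the threshold `delta` are all illustrative assumptions.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def label_weights(labels):
    """Hypothetical weighting: a label's share of all pairwise label MI."""
    q = len(labels)
    totals = [
        sum(mutual_information(labels[i], labels[j]) for j in range(q) if j != i)
        for i in range(q)
    ]
    total = sum(totals)
    if total == 0:  # labels look independent: fall back to uniform weights
        return [1.0 / q] * q
    return [t / total for t in totals]


def weighted_relevance(feature, labels, weights):
    """Label-weighted sum of MI between one feature and every label."""
    return sum(w * mutual_information(feature, y) for w, y in zip(weights, labels))


def stream_select(features, labels, delta=0.05):
    """Process (name, values) pairs one at a time; return kept feature names."""
    weights = label_weights(labels)
    selected = {}  # name -> (values, relevance score)
    for name, values in features:
        score = weighted_relevance(values, labels, weights)
        if score <= delta:  # assumed significance/relevance threshold
            continue
        keep, to_remove = True, []
        for other, (other_values, other_score) in selected.items():
            overlap = mutual_information(values, other_values)
            if other_score >= score and overlap >= score:
                keep = False  # an equal-or-better kept feature covers this one
                break
            if score > other_score and overlap >= other_score:
                to_remove.append(other)  # the new feature covers a weaker one
        if keep:
            for other in to_remove:
                del selected[other]
            selected[name] = (values, score)
    return list(selected)


# Toy data: three correlated binary labels and four streamed features.
labels = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
]
features = [
    ("f1", [0, 0, 0, 0, 1, 1, 1, 1]),  # copy of the first label
    ("f2", [0, 0, 0, 0, 1, 1, 1, 1]),  # exact duplicate of f1
    ("f3", [0, 1, 0, 1, 0, 1, 0, 1]),  # nearly uninformative
    ("f4", [0, 0, 1, 1, 1, 1, 1, 1]),  # copy of the second label
]
selected = stream_select(features, labels)
print(selected)
```

On this toy stream, `f2` is discarded as a duplicate of `f1` and `f3` falls below the threshold, while `f4` is kept because it carries information that `f1` does not fully cover. A real implementation would need the paper's own definitions of label weight, significance, and redundancy.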