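Mutual information
Feature selection
Feature (linguistics)
Data mining
Computer science
Interaction information
Independence (probability theory)
Pattern recognition (psychology)
Set (abstract data type)
Pointwise mutual information
Relevance (law)
Artificial intelligence
Selection (genetic algorithm)
Conditional mutual information
Machine learning
Mathematics
Statistics
Philosophy
Linguistics
Political science
Law
Programming language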
Authors
Ping Zhang, Guixia Liu, Jiazhi Song
Identifier
DOI:10.1016/j.patcog.2023.109378
Abstract
Multi-label feature selection extracts a reliable and informative feature subset from high-dimensional multi-label data and plays an important role in pattern recognition. In conventional information-theoretic multi-label feature selection methods, the high-order relevance between a feature and the label set is evaluated using low-order mutual information. However, existing methods do not establish a theoretical basis for this low-order approximation. To fill this gap, we first identify two underlying assumptions based on the high-order label distribution: the Label Independence Assumption (LIA) and the Paired-label Independence Assumption (PIA). Second, we systematically analyze the strengths and weaknesses of the two assumptions and introduce joint mutual information to accommodate more realistic label distributions. Furthermore, by decomposing joint mutual information, an interaction weight is proposed to account for multiple label correlations. Finally, a new method that combines joint mutual information and the interaction weight is proposed. Comprehensive experiments demonstrate the effectiveness of the proposed method on various evaluation metrics.
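
To make the contrast between low-order mutual information and joint mutual information concrete, here is a minimal sketch on toy data. It is an illustration only and not the scoring function proposed in the paper: the function names (`mutual_information`, `relevance_single_labels`, `relevance_label_pairs`), the plug-in estimator, and the toy arrays are all assumptions made for this example. It scores one discrete feature against a label set in two ways: by summing I(f; l_i) over individual labels, and by summing the joint mutual information I(f; l_i, l_j) over label pairs.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Empirical (plug-in) mutual information I(X; Y) in bits for discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def relevance_single_labels(feature, labels):
    """Hypothetical low-order score: sum of I(f; l_i) over individual labels."""
    return sum(mutual_information(feature, label) for label in labels)


def relevance_label_pairs(feature, labels):
    """Hypothetical pairwise score: sum of I(f; l_i, l_j) over label pairs."""
    return sum(
        # Treat each label pair as one joint variable by zipping its values.
        mutual_information(feature, list(zip(a, b)))
        for a, b in combinations(labels, 2)
    )


# Toy data (made up for illustration): the feature is the XOR of two labels.
l1 = [0, 0, 1, 1]
l2 = [0, 1, 0, 1]
f = [0, 1, 1, 0]

print(relevance_single_labels(f, [l1, l2]))  # 0.0
print(relevance_label_pairs(f, [l1, l2]))    # 1.0
```

In this toy case the feature carries no information about either label on its own, yet it fully determines their joint configuration, so only the pairwise joint score registers it as relevant. Real data would need a more careful estimator than raw frequency counts.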