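Feature selection
Computer science
Entropy (arrow of time)
Data mining
Fuzzy logic
Artificial intelligence
Joint entropy
Machine learning
Principle of maximum entropy
Quantum mechanics
Physics
Authors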
Chucai Zhang,Zhengxiang Lu,Yongkang Zhang,Jianhua Dai
Identifier
DOI:10.1109/tfuzz.2025.3529466
Abstract
Online streaming feature selection has garnered widespread attention due to its efficiency and adaptability in dynamic data environments. However, existing methods primarily focus on the correlation and redundancy among features, often overlooking the complementarity between candidate and selected features. This paper addresses that gap with three contributions. First, we construct a novel metric, fuzzy Gini entropy (FGE), to measure feature uncertainty within datasets. Unlike traditional information entropy, fuzzy Gini entropy inherits the advantages of the Gini index, effectively measuring the impurity of datasets, while also handling fuzzy environments. Related metrics are developed accordingly: fuzzy joint Gini entropy, fuzzy conditional Gini entropy, and fuzzy mutual Gini information. Second, we propose the bidirectional complementarity ratio (BCR), which captures the relationship between candidate features and previously selected features in online streaming feature selection. This mitigates the unfairness that late-arriving features face: a candidate is chosen when its bidirectional complementary effect with the selected features outweighs its redundancy with them. Third, we design an online streaming feature selection method named FGE-OSFS, which evaluates streaming features in three steps: online relevance analysis, online bidirectional complementarity analysis, and online redundancy analysis. Finally, we compare the proposed method with five state-of-the-art online streaming feature selection methods, and the comparison demonstrates the effectiveness of the new approach.
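The abstract does not give the formula for fuzzy Gini entropy, so the following is only an illustrative sketch of the general idea, not the authors' definition. It shows the classical Gini index (1 minus the sum of squared class proportions) next to one plausible Gini-style measure computed from a fuzzy similarity relation. The function names, the similarity relation `1 - |x_i - x_j|`, and the formula in `gini_style_fuzzy_entropy` are all assumptions made for illustration.

```python
import numpy as np


def gini_index(labels):
    """Classical Gini index: 1 - sum_c p_c^2 over class proportions p_c."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))


def crisp_relation(labels):
    """Crisp equivalence relation: R[i, j] = 1 if samples i and j share a label."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)


def fuzzy_similarity(values):
    """Hypothetical fuzzy similarity for one feature scaled to [0, 1]:
    R[i, j] = 1 - |x_i - x_j|. This choice is an assumption, not from the paper."""
    v = np.asarray(values, dtype=float)
    return 1.0 - np.abs(v[:, None] - v[None, :])


def gini_style_fuzzy_entropy(R):
    """Assumed Gini-style uncertainty of a fuzzy relation R (n x n):
    mean over samples of 1 - |[x_i]_R| / n, where |[x_i]_R| = sum_j R[i, j].
    Illustrative only; the paper's FGE may be defined differently."""
    n = R.shape[0]
    class_size = R.sum(axis=1)  # fuzzy cardinality of each sample's class
    return float(np.mean(1.0 - class_size / n))


labels = [0, 0, 1, 1, 1, 2]
crisp_value = gini_style_fuzzy_entropy(crisp_relation(labels))
fuzzy_value = gini_style_fuzzy_entropy(fuzzy_similarity([0.1, 0.15, 0.5, 0.55, 0.6, 0.95]))
```

Under these assumptions, the measure coincides with the classical Gini index whenever the relation is a crisp equivalence relation, which is the sense in which such a metric could "inherit" the Gini index while still accepting graded similarities.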