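Computer science
Feature selection
Mutual information
Pearson product-moment correlation coefficient
Pattern recognition (psychology)
Support vector machine
Artificial intelligence
Redundancy (engineering)
Correlation coefficient
Correlation
Filter (signal processing)
Data mining
Algorithm
Machine learning
Mathematics
Statistics
Geometry
Computer vision
Operating system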
Authors
Huanhuan Gong,Yanying Li,Jiaoni Zhang,Baoshuang Zhang,Xialin Wang
Identifier
DOI:10.1016/j.engappai.2024.107865
Abstract
Feature selection is widely used in various fields as a key means of data dimensionality reduction. Existing feature selection algorithms use only a single linear or nonlinear correlation measure when evaluating relationships between variables, which lacks diversity. Considering the complexity of the relationships between features, a novel feature selection evaluation function, CONMI, is constructed. It combines the Pearson correlation coefficient (linear) and normalized mutual information (nonlinear) to comprehensively portray the dependencies between features and class variables. We further propose the CONMI_FS algorithm based on CONMI, which selects the optimal subset of features that has high correlation with the class variables and low redundancy among the selected features. CONMI_FS is compared with four methods on 20 datasets and evaluated by reduction rate, classification accuracy, precision, and recall with KNN, SVM, and DT classifiers. The experimental results show that CONMI_FS obtains the highest reduction rate, 80.04%, and achieves the best classification accuracy with the KNN and SVM classifiers, 88.83% and 88.98%, respectively. These results indicate that CONMI_FS is competitive.
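The abstract does not give the CONMI formula or the CONMI_FS search procedure, so the following is only a minimal sketch of the general idea: score a pair of variables with both a linear measure (absolute Pearson correlation) and a nonlinear one (normalized mutual information), then pick features greedily by relevance minus redundancy. The equal-weight average, the equal-width binning, the geometric-mean normalization of mutual information, the generic greedy search, and all function names (`combined_score`, `select_features`, and so on) are assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def entropy(values):
    """Shannon entropy (natural log) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())

def discretize(x, bins=4):
    """Equal-width binning (an assumed choice, not taken from the paper)."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0] * len(x)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in x]

def nmi(x, y):
    """Mutual information of two discrete sequences, divided by sqrt(H(x) * H(y))."""
    hx, hy = entropy(x), entropy(y)
    if hx == 0 or hy == 0:
        return 0.0
    mi = hx + hy - entropy(list(zip(x, y)))
    return mi / math.sqrt(hx * hy)

def combined_score(x, y, bins=4):
    """Hypothetical stand-in for CONMI: equal-weight mean of |Pearson| and NMI."""
    return 0.5 * abs(pearson(x, y)) + 0.5 * nmi(discretize(x, bins), discretize(y, bins))

def select_features(columns, y, k):
    """Greedy relevance-minus-redundancy selection of k column indices."""
    remaining = list(range(len(columns)))
    selected = []

    def merit(j):
        relevance = combined_score(columns[j], y)
        if not selected:
            return relevance
        redundancy = sum(combined_score(columns[j], columns[s]) for s in selected)
        return relevance - redundancy / len(selected)

    while remaining and len(selected) < k:
        best = max(remaining, key=merit)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: numeric class labels, one strong feature, one weaker feature,
# and a rescaled copy of the strong feature (fully redundant with it).
y = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [1, 2, 3, 4, 5, 6, 7, 8],
    [1, 1, 2, 1, 2, 2, 1, 2],
    [2, 4, 6, 8, 10, 12, 14, 16],
]
selected = select_features(columns, y, k=2)
print(selected)
```

On this toy data the rescaled copy carries the same relevance as the strong feature but is fully redundant with it, so the redundancy penalty should steer the second pick toward the weaker, less redundant feature. A different weighting or discretization could change the ranking, which is why the combination rule should be taken from the paper itself before reuse.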