Feature selection
Mutual information
Minimum redundancy feature selection
Artificial intelligence
Computer science
Support vector machine
Naive Bayes classifier
Redundancy (engineering)
Pattern recognition (psychology)
Data mining
Dependency (UML)
Feature (linguistics)
Machine learning
Linguistics
Operating system
Philosophy
Authors
Hanchuan Peng, Fuhui Long, Chris Ding
Identifier
DOI: 10.1109/TPAMI.2005.159
Abstract
Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion based on mutual information. Because of the difficulty in directly implementing the maximal dependency condition, we first derive an equivalent form, called the minimal-redundancy-maximal-relevance (mRMR) criterion, for first-order incremental feature selection. We then present a two-stage feature selection algorithm that combines mRMR with other, more sophisticated feature selectors (e.g., wrappers). This allows us to select a compact set of superior features at very low cost. We perform an extensive experimental comparison of our algorithm against other methods using three different classifiers (naive Bayes, support vector machine, and linear discriminant analysis) and four different data sets (handwritten digits, arrhythmia, NCI cancer cell lines, and lymphoma tissues). The results confirm that mRMR leads to promising improvements in feature selection and classification accuracy.
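The first-order incremental scheme described in the abstract can be sketched in a few lines: at each step, pick the candidate feature that maximizes its mutual information with the class labels (relevance) minus its average mutual information with the already-selected features (redundancy). The sketch below is an illustration of that greedy rule for discrete features, not the authors' reference implementation; the `mutual_info` and `mrmr` helpers and the toy data are assumptions introduced here.

```python
from collections import Counter
from math import log

def mutual_info(xs, ys):
    # Empirical mutual information I(X; Y), in nats, for two discrete sequences.
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def mrmr(features, labels, k):
    # Greedy first-order mRMR: repeatedly add the feature maximizing
    # relevance I(f; y) minus mean redundancy with the selected set.
    remaining = list(features)          # list, so ties break deterministically
    selected = []
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_info(features[name], labels)
            redundancy = (sum(mutual_info(features[name], features[s])
                              for s in selected) / len(selected)) if selected else 0.0
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example (hypothetical data): f2 duplicates f1, so after f1 is chosen
# the redundancy penalty makes the weakly related f3 preferable to f2.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
feats = {
    "f1": [0, 0, 0, 1, 1, 1, 1, 1],  # relevant (one flip away from labels)
    "f2": [0, 0, 0, 1, 1, 1, 1, 1],  # exact copy of f1: fully redundant
    "f3": [0, 1, 0, 1, 0, 1, 0, 1],  # statistically independent of labels
}
print(mrmr(feats, labels, 2))  # -> ['f1', 'f3']
```

The subtraction of mean redundancy is what separates mRMR from plain relevance ranking: a pure max-relevance criterion would rank the duplicate `f2` second, even though it adds no new information once `f1` is in the set.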