Keywords: Boosting (machine learning), Feature selection, Minimum-redundancy feature selection, Sample weighting, Redundancy, Artificial intelligence, Machine learning, Data mining, Pattern recognition, Benchmarking
Authors
Ahmad Alsahaf, Nicolai Petkov, Veena Shenoy, George Azzopardi
DOI: 10.1016/j.eswa.2021.115895
Abstract
As dimensions of datasets in predictive modelling continue to grow, feature selection becomes increasingly practical. Datasets with complex feature interactions and high levels of redundancy still present a challenge to existing feature selection methods. We propose a novel framework for feature selection that relies on boosting, or sample re-weighting, to select sets of informative features in classification problems. The method uses as its basis the feature rankings derived from fast and scalable tree-boosting models, such as XGBoost. We compare the proposed method to standard feature selection algorithms on 9 benchmark datasets. We show that the proposed approach reaches higher accuracies with fewer features on most of the tested datasets, and that the selected features have lower redundancy.
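The core idea of the abstract, selecting features iteratively while re-weighting samples so that later rounds favour features explaining what earlier ones could not, can be illustrated with a minimal sketch. This is not the authors' actual method (which builds on feature rankings from tree-boosting models such as XGBoost); here decision stumps stand in as the base learner, and the function names (`best_stump`, `boosted_feature_selection`) are hypothetical, chosen for illustration only.

```python
import numpy as np

def best_stump(x, y, w):
    """Weighted error and predictions of the best threshold stump on one feature.
    y is in {0, 1}; w is a distribution over samples (sums to 1)."""
    best_err, best_pred = np.inf, None
    for t in np.unique(x):
        for polarity in (1, -1):
            pred = ((x > t) if polarity == 1 else (x <= t)).astype(int)
            err = np.sum(w * (pred != y))
            if err < best_err:
                best_err, best_pred = err, pred
    return best_err, best_pred

def boosted_feature_selection(X, y, n_features):
    """Greedy feature selection with AdaBoost-style sample re-weighting.

    Each round picks the unselected feature whose stump has the lowest
    weighted error, then upweights the samples that stump misclassifies.
    A feature redundant with an already-selected one fails on the same
    samples, so it scores poorly under the new weights.
    """
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    selected = []
    for _ in range(n_features):
        results = {j: best_stump(X[:, j], y, w)
                   for j in range(d) if j not in selected}
        j_best = min(results, key=lambda j: results[j][0])
        err, pred = results[j_best]
        selected.append(j_best)
        # AdaBoost-style update: boost the weight of misclassified samples.
        eps = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)
        w = w * np.exp(alpha * (pred != y))
        w = w / w.sum()
    return selected
```

A quick way to exercise the sketch is a dataset with one informative feature, an exact redundant copy of it, and a noise feature; after the informative feature is chosen, the re-weighting penalises its copy in the next round.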