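Feature selection
Computer science
Redundancy (engineering)
Minimum redundancy feature selection
Classification
Data mining
Pattern recognition (psychology)
Feature (linguistics)
Curse of dimensionality
Feature extraction
Artificial intelligence
Data redundancy
Dimensionality reduction
Algorithm
Database
Linguistics
Philosophy
Operating system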
Authors
Chaolu Meng,Ye Yuan,Haiyan Zhao,Yue Pei,Zhi Li
Identifier
DOI:10.1016/j.compbiomed.2023.107654
Abstract
Discrete features can be obtained from protein sequences using a feature extraction method. These features are the basis of downstream processing of protein data, but it is necessary to screen and select some important features from them as they generally have data redundancy. Here, we report IIFS, an improved incremental feature selection method that exploits a new subset search strategy to find the optimal feature set. IIFS combines nonadjacent sorting features to prevent the drawbacks of data explosion and excessive reliance on feature sorting results. The comparative experimental results on 27 feature sorting data show that IIFS can find more accurate and important features compared to existing methods. The IIFS approach also handles data redundancy more efficiently and finds more representative and discriminatory features while ensuring minimal feature dimensionality and good evaluation metrics. Moreover, we wrap this method and deploy it on a web server for access at http://112.124.26.17:8005/.
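
The abstract does not spell out the IIFS search strategy, so the following is only a minimal sketch of the general idea it builds on. The first function is conventional incremental feature selection (IFS), which scores every top-k prefix of a ranked feature list. The second is a hypothetical greedy variant, not the authors' algorithm, that shows one way nonadjacent features from the ranking could end up combined. The function names, the toy scorer `toy_score`, and the feature names are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[List[str]], float]


def incremental_feature_selection(
    ranked: Sequence[str], score_subset: Scorer
) -> Tuple[List[str], float]:
    """Conventional IFS: evaluate each top-k prefix of the ranked list."""
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked) + 1):
        candidate = list(ranked[:k])
        score = score_subset(candidate)
        # Strict comparison: on ties the smaller (earlier) subset is kept.
        if score > best_score:
            best_subset, best_score = candidate, score
    return best_subset, best_score


def greedy_nonadjacent_selection(
    ranked: Sequence[str], score_subset: Scorer
) -> Tuple[List[str], float]:
    """Hypothetical variant (NOT the published IIFS procedure): walk the
    ranking once and keep a feature only if it improves the score, so the
    final subset may combine features that are not adjacent in the ranking."""
    selected: List[str] = []
    best_score = float("-inf")
    for feature in ranked:
        candidate = selected + [feature]
        score = score_subset(candidate)
        if score > best_score:
            selected, best_score = candidate, score
    return selected, best_score


# Toy setup (assumed): features "a" and "c" are informative, "b" and "d"
# are redundant, and every selected feature costs a small size penalty.
USEFUL = {"a", "c"}
RANKED = ["a", "b", "c", "d"]


def toy_score(subset: List[str]) -> float:
    return len(USEFUL.intersection(subset)) - 0.1 * len(subset)


ifs_subset, ifs_score = incremental_feature_selection(RANKED, toy_score)
greedy_subset, greedy_score = greedy_nonadjacent_selection(RANKED, toy_score)
print(ifs_subset)     # ['a', 'b', 'c']
print(greedy_subset)  # ['a', 'c']
```

In this toy ranking, prefix-based IFS has to carry the redundant feature "b" to reach "c", whereas the greedy variant skips it. In practice `score_subset` would typically be a cross-validated classifier metric, which is an assumption here rather than something the abstract states.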