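Missing data
Feature selection
Computer science
Feature (linguistics)
Data mining
Selection (genetic algorithm)
Streaming data
Factor (programming language)
Quality (philosophy)
Feature vector
Machine learning
Artificial intelligence
Linguistics
Epistemology
Philosophy
Programming language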
Authors
Di Wu,Yi He,Xin Luo,MengChu Zhou
Identifier
DOI:10.1109/tsmc.2021.3096065
Abstract
Online streaming feature selection (OSFS) has attracted extensive attention during the past decades. Current approaches commonly assume that the feature space of fixed data instances dynamically increases without any missing data. However, this assumption does not always hold in many real applications. Motivated by this observation, this study aims to implement online feature selection from sparse streaming features, i.e., features that arrive one by one with missing data while the instance count remains fixed. To do so, this study proposes a latent-factor-analysis-based online sparse-streaming-feature selection algorithm (LOSSA). Its main idea is to apply latent factor analysis to pre-estimate missing data in sparse streaming features before conducting feature selection, thereby addressing the missing data issue effectively and efficiently. Theoretical and empirical studies indicate that LOSSA can significantly improve the quality of OSFS when missing data are encountered in target instances.
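The idea of pre-estimating missing entries before selecting features can be illustrated with a minimal sketch. This is not the authors' LOSSA implementation: the function names `lfa_impute` and `select_streaming`, the SGD-trained latent factor model, the rank, learning rate, regularization, and the correlation-threshold selection rule are all assumptions chosen for illustration only.

```python
import numpy as np


def lfa_impute(X, rank=2, lr=0.01, reg=0.05, epochs=100, seed=0):
    """Fill NaNs in X using a latent factor model fit on observed entries only.

    Hypothetical helper: X ~ P @ Q.T, trained by plain SGD with L2 regularization.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    P = rng.normal(scale=0.1, size=(n, rank))  # instance factors
    Q = rng.normal(scale=0.1, size=(m, rank))  # feature factors
    rows, cols = np.nonzero(~np.isnan(X))      # indices of observed entries
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = X[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])
    X_hat = P @ Q.T
    # Keep observed values as they are; use estimates only where data is missing.
    return np.where(np.isnan(X), X_hat, X)


def select_streaming(X_sparse, y, threshold=0.3, **lfa_kwargs):
    """Process features one by one: impute first, then apply a selection test.

    The |Pearson correlation| >= threshold rule is a stand-in for a real
    OSFS relevance/redundancy test, assumed here for brevity.
    """
    selected = []
    for t in range(X_sparse.shape[1]):
        arrived = X_sparse[:, : t + 1]           # features seen so far
        completed = lfa_impute(arrived, **lfa_kwargs)
        feat = completed[:, t]                   # newly arrived feature, completed
        if np.std(feat) > 0 and abs(np.corrcoef(feat, y)[0, 1]) >= threshold:
            selected.append(t)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = rng.normal(size=40)
    X = np.column_stack([y + 0.1 * rng.normal(size=40), rng.normal(size=40)])
    X[rng.random(X.shape) < 0.2] = np.nan        # drop about 20% of the entries
    print(select_streaming(X, y, epochs=50))
```

Re-fitting the factor model on every arrival keeps the sketch short; a practical system would more likely update the factors incrementally or impute over a buffer of recent features.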