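Computer science
Feature selection
Feature (linguistics)
Artificial intelligence
Pattern recognition (psychology)
Selection (genetic algorithm)
Linguistics
Philosophy
Authors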
Shengda Zhuo,Jin-Jie Qiu,Chang‐Dong Wang,Shuqiang Huang
Identifier
DOI:10.1109/tkde.2024.3377243
Abstract
Feature selection, an essential technique in data mining, is often confined to batch learning or to idealized online data scenarios despite its significance. Existing online feature selection methods make specific assumptions about the data stream, such as a fixed feature space with an explicit pattern and complete labeling of samples. Unfortunately, data streams generated in many real scenarios commonly exhibit arbitrarily incomplete feature spaces and scarce labels, making existing approaches unsuitable for real applications. To fill these gaps, this study proposes a new problem called Online Feature Selection with Varying Feature Spaces (OFSVF). OFSVF has a three-fold main idea: 1) it leverages a Gaussian copula to model incomplete feature correlations in a complete latent space encoded by continuous variables, 2) it employs a novel tree-ensemble-based approach to select the most informative features on the fly, and 3) it exploits the underlying geometric structure of instances to establish the relationship between unlabeled and labeled instances. Experimental results are documented to demonstrate the feasibility and effectiveness of our proposed method.
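The abstract does not describe how the Gaussian copula is fitted, so the following is only an illustrative sketch of the general idea behind point 1, not the authors' procedure. It assumes missing entries are encoded as `None`, maps each feature's observed values to standard-normal scores through its empirical CDF, and estimates a latent correlation from rows where both features are observed. The function names `normal_scores` and `latent_correlation` are hypothetical.

```python
import math
from statistics import NormalDist


def normal_scores(column):
    """Map the observed values of one feature to latent Gaussian scores.

    Missing entries (None) stay None. Ranks are divided by n + 1 so the
    empirical CDF never reaches 0 or 1, where the inverse normal CDF diverges.
    """
    observed = [v for v in column if v is not None]
    n = len(observed)
    std_normal = NormalDist()
    scores = []
    for v in column:
        if v is None:
            scores.append(None)
            continue
        # Count of observed values <= v (ties share the highest rank).
        rank = sum(1 for o in observed if o <= v)
        scores.append(std_normal.inv_cdf(rank / (n + 1)))
    return scores


def latent_correlation(col_a, col_b):
    """Pearson correlation of normal scores over rows where both are observed."""
    za, zb = normal_scores(col_a), normal_scores(col_b)
    pairs = [(a, b) for a, b in zip(za, zb) if a is not None and b is not None]
    if len(pairs) < 2:
        return float("nan")
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)
```

Calling `latent_correlation(x, y)` on two partially observed feature columns returns a value in [-1, 1], or NaN when fewer than two rows are jointly observed. A pairwise-complete estimate like this is a simplification; copula models for incomplete data are usually fitted with an iterative scheme such as EM.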