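Artificial intelligence
Pattern recognition (psychology)
Computer science
Local binary patterns
Histogram
Classifier (UML)
Support vector machine
Receiver operating characteristic
Feature (linguistics)
Feature vector
Synthetic data
Machine learning
Image (mathematics)
Linguistics
Philosophy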
Authors
Renhao Liu, Lawrence Hall, Kevin W. Bowyer, Dmitry B. Goldgof, Robert A. Gatenby, Kaoutar Ben Ahmed
Identifier
DOI:10.1109/smc.2017.8122802
Abstract
Real-world datasets are often imbalanced, with an important class having many fewer examples than other classes. In medical data, normal examples typically greatly outnumber disease examples. A classifier learned from imbalanced data will tend to be very good at predicting examples in the larger (normal) class, yet the smaller (disease) class is typically of more interest. Imbalance is commonly dealt with at the feature vector level (by creating synthetic feature vectors or discarding some examples from the larger class) or by assigning differential costs to errors. Here, we introduce a novel method for over-sampling minority class examples at the image level, rather than the feature vector level. Our method was applied to the problem of glioblastoma patient survival group prediction. Synthetic minority class examples were created by adding Gaussian noise to original medical images from the minority class. Uniform local binary patterns (LBP) histogram features were then extracted from the original and synthetic image examples and classified with a random forests classifier. Experimental results show that the new method (Image SMOTE) increased minority class predictive accuracy and also the AUC (area under the receiver operating characteristic curve), compared to using the imbalanced dataset directly or to creating synthetic feature vectors.
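The image-level over-sampling idea described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the noise level `sigma`, the number of synthetic copies per image, the 8-neighbour radius-1 LBP with 59 bins, the clipping to [0, 255], the random toy images, and all function names are assumptions chosen for illustration, since the abstract does not state these details. The classifier step is left out; the resulting histograms could be passed to any classifier, such as a random forest.

```python
import random

# 8 neighbours at radius 1, listed in circular order around the centre pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def transitions(code):
    """Count 0/1 changes in the circular 8-bit pattern `code`."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))


# The 58 "uniform" patterns (at most two transitions) each get their own bin;
# all non-uniform patterns share one extra bin, giving 59 bins (assumed variant).
UNIFORM_CODES = [c for c in range(256) if transitions(c) <= 2]
BIN_OF = {code: i for i, code in enumerate(UNIFORM_CODES)}
N_BINS = len(UNIFORM_CODES) + 1


def add_gaussian_noise(image, sigma, rng):
    """Return a noisy copy of a grayscale image (list of rows), clipped to [0, 255]."""
    return [[min(255.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
            for row in image]


def oversample_minority(images, copies_per_image, sigma, seed=0):
    """Create synthetic minority images by adding Gaussian noise to each original."""
    rng = random.Random(seed)
    return [add_gaussian_noise(img, sigma, rng)
            for img in images
            for _ in range(copies_per_image)]


def uniform_lbp_histogram(image):
    """Normalized histogram of uniform LBP codes over the interior pixels."""
    hist = [0] * N_BINS
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            center = image[r][c]
            code = 0
            for i, (dr, dc) in enumerate(OFFSETS):
                if image[r + dr][c + dc] >= center:
                    code |= 1 << i
            hist[BIN_OF.get(code, N_BINS - 1)] += 1
    total = sum(hist)
    return [h / total for h in hist]


# Toy stand-ins for medical images: 2 minority and 6 majority examples.
toy_rng = random.Random(1)


def toy_image(size=16):
    return [[float(toy_rng.randint(0, 255)) for _ in range(size)]
            for _ in range(size)]


minority = [toy_image() for _ in range(2)]
majority = [toy_image() for _ in range(6)]

# Two noisy copies per minority image: 2 originals + 4 synthetic = 6 examples.
synthetic = oversample_minority(minority, copies_per_image=2, sigma=10.0)
balanced_minority = minority + synthetic

features = [uniform_lbp_histogram(img) for img in majority + balanced_minority]
labels = [0] * len(majority) + [1] * len(balanced_minority)
```

Here the synthetic examples are created before feature extraction, so each noisy image yields its own LBP histogram. This is the distinction the abstract draws from feature-level methods, which interpolate or duplicate feature vectors after extraction.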