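Oversampling
Overfitting
Computer science
k-nearest neighbors algorithm
Artificial intelligence
Data mining
Set (abstract data type)
Machine learning
Pattern recognition (psychology)
Algorithm
Bandwidth (computing)
Computer network
Artificial neural network
Programming language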
Authors
R. R. Das, Saroj Kr. Biswas, Debashree Devi, Biswajit Dev Sarma
Source
Published in: 2020 International Conference on Smart Electronics and Communication (ICOSEC)
Date: 2020-09-01
Pages: 1239-1244
Citations: 19
Identifier
DOI: 10.1109/icosec49089.2020.9215387
Abstract
In recent years, classification on imbalanced datasets has attracted growing attention in machine learning. SMOTE (Synthetic Minority Oversampling Technique) is a traditional approach to this problem. Its main drawback is overfitting, because it synthesizes minority samples at random without regard to the significance of the majority class. To address this, the paper proposes a new algorithm, Reverse-Synthetic Minority Oversampling Technique (R-SMOTE), based on SMOTE and Reverse-Nearest Neighbor (R-NN). R-SMOTE extracts a significant set of data points from the minority class and uses that set to synthesize new samples from their reverse nearest neighbors. The proposed algorithm is compared with four standard oversampling techniques. The empirical analysis indicates that R-SMOTE produces markedly better results than the existing oversampling methods.
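
The abstract does not say how the "significant set" is chosen or exactly how the reverse nearest neighbors are used, so the following is only a minimal sketch of the general idea, not the authors' R-SMOTE. It rests on several assumptions: neighbors are computed over both classes with Euclidean distance, a minority point counts as "significant" when at least one other minority point lists it among its k nearest neighbors, and new samples are made by SMOTE-style linear interpolation between a significant point and one of its minority reverse neighbors. The function names (`knn_indices`, `reverse_neighbors`, `rnn_oversample`) and all parameters are hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (self excluded)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def reverse_neighbors(nn, n):
    """rnn[p] lists every point q that has p among its k nearest neighbors."""
    rnn = [[] for _ in range(n)]
    for q, neighbors in enumerate(nn):
        for p in neighbors:
            rnn[p].append(q)
    return rnn


def rnn_oversample(X, y, minority_label, n_new, k=3, seed=0):
    """Synthesize n_new minority samples from reverse nearest neighbors.

    Assumption (not from the paper): a minority point is "significant"
    if it has at least one minority reverse neighbor.
    """
    rng = np.random.default_rng(seed)
    nn = knn_indices(X, k)  # neighbors taken over both classes
    rnn = reverse_neighbors(nn, len(X))
    is_minority = y == minority_label

    # Keep only minority points that have minority reverse neighbors.
    partners = {}
    for i in np.flatnonzero(is_minority):
        same_class = [q for q in rnn[i] if is_minority[q]]
        if same_class:
            partners[i] = same_class
    if not partners:
        raise ValueError("no minority point has a minority reverse neighbor")

    seeds = list(partners)
    synthetic = np.empty((n_new, X.shape[1]))
    for row in range(n_new):
        i = seeds[rng.integers(len(seeds))]
        j = partners[i][rng.integers(len(partners[i]))]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic[row] = X[i] + gap * (X[j] - X[i])
    return synthetic


# Toy data: six majority points near the origin, three minority points near (5, 5).
X_demo = np.array(
    [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8],
     [5, 5], [5, 6], [6, 5]],
    dtype=float,
)
y_demo = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1])
new_points = rnn_oversample(X_demo, y_demo, minority_label=1, n_new=4, k=2)
```

Because each synthetic sample is a convex combination of two minority points, it stays inside the region spanned by the minority class. Minority points that no other minority point regards as a neighbor (for example, isolated points surrounded by the majority class) are never used as seeds under this assumed selection rule.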