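Oversampling
Computer science
Machine learning
Artificial intelligence
Preprocessor
Classifier (UML)
Ensemble learning
Data mining
Bandwidth (computing)
Computer network
Authors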
Sharath Nittur Sridhar,A. Kalaivani
Source
Published in: 2020 International Conference on Decision Aid Sciences and Application (DASA)
Date: 2020-11-08
Citations: 1
Identifiers
DOI:10.1109/dasa51403.2020.9317019
Abstract
In real-world applications, class distributions are very often skewed across multiple classes. Many past studies have attempted to overcome this imbalance problem, which is a serious issue for standard machine learning techniques, especially classification and regression, yet there is still a need to handle it effectively. Imbalanced datasets generally include both safe and unsafe minority samples. Our proposed approach is a classifier-independent, two-tier iterative ensemble approach that focuses on the influence of rare minority samples when learning from imbalanced datasets. Most informed oversampling techniques, such as SMOTE and its variants, cannot be applied directly to rare class samples, especially when the number of rare samples is too low. To alleviate this problem and learn from rare and outlying samples, we propose a hybrid oversampling technique that is applied at different levels to balance the classes. The goal is to reduce the data imbalance at the data preprocessing stage itself, by balancing the training datasets before moving to the learning stage; this lets the classifier focus on its primary role and thereby improves the learning process. The experimental results show that the proposed two-tier iterative ensemble approach significantly improves learning on multiclass imbalanced data.
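The abstract does not spell out the hybrid oversampling procedure, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a two-step scheme: classes with too few samples for a k-nearest-neighbour search are first topped up by random duplication, and SMOTE-style interpolation then fills each class up to the size of the largest class. The function names `hybrid_oversample` and `_nearest`, the parameter `k`, and the balancing target are all hypothetical choices made for this example.

```python
import random
from collections import Counter, defaultdict


def _nearest(idx, points, k):
    """Return indices of the k nearest neighbours of points[idx]."""
    def dist(j):
        # Squared Euclidean distance is enough for ranking neighbours.
        return sum((a - b) ** 2 for a, b in zip(points[idx], points[j]))
    others = [j for j in range(len(points)) if j != idx]
    return sorted(others, key=dist)[:k]


def hybrid_oversample(X, y, k=5, seed=0):
    """Balance every class up to the size of the largest class (k >= 1).

    Step 1 (assumption): a class with at most k samples is topped up to
    k + 1 samples by random duplication, so a k-neighbour search is possible.
    Step 2: SMOTE-style interpolation between a sample and one of its
    k nearest same-class neighbours.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(list(row))
    target = max(len(rows) for rows in by_class.values())

    X_out, y_out = [], []
    for label, rows in by_class.items():
        pool = list(rows)
        # Step 1: duplicate rare samples until neighbours can be found.
        while len(pool) < min(k + 1, target):
            pool.append(list(rng.choice(rows)))
        # Step 2: interpolate between neighbours within the same class.
        base = list(pool)
        while len(pool) < target:
            i = rng.randrange(len(base))
            j = rng.choice(_nearest(i, base, k))
            gap = rng.random()
            pool.append([a + gap * (b - a) for a, b in zip(base[i], base[j])])
        X_out.extend(pool)
        y_out.extend([label] * len(pool))
    return X_out, y_out


# Toy data: class 0 is the majority, class 1 is small, class 2 is rare.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5], [0.2, 0.8],
     [5, 5], [5, 6],
     [9, 9]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 2]
X_bal, y_bal = hybrid_oversample(X, y, k=2, seed=0)
print(Counter(y_bal))
```

In this sketch a class with a single sample can only ever yield copies of that sample, which illustrates why very rare classes need more than plain interpolation; a practical pipeline would likely add jitter or a different generator at that step.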