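Support vector machine
Computer science
Artificial intelligence
Pattern recognition (psychology)
Kernel (algebra)
Oversampling
Decision boundary
Data mining
Machine learning
Mathematics
Computer network
Bandwidth (computing)
Combinatorics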
Authors
Jiaqi Guo, Haiyan Wu, Xiaolei Chen, Weiguo Lin
Identifier
DOI:10.1016/j.asoc.2023.110986
Abstract
In recent years, imbalanced data classification has emerged as a challenging task. To address it, we propose an adaptive SV-Borderline SMOTE-SVM (Synthetic Minority Oversampling Technique-Support Vector Machine) algorithm. The algorithm first maps the dataset into the kernel space using an SVM to identify the class boundary samples, known as support vectors (SVs). Next, the neighbors of each positive-class support vector (SV+) are found using the kernel distance. Depending on the class distribution of these neighbors, each SV+ sample is labeled as either “concave” or “convex”. New samples are then generated adaptively, using a different calculation approach for each of the two labels. To construct the SVM decision function without requiring an explicit expression of the new samples in the kernel space, a Gram matrix is designed. Together, these steps ensure the credibility and reliability of the new samples, and the adaptive interpolation approach helps to ensure their safety and diversity. Extensive experiments were conducted on 50 KEEL datasets to evaluate the proposed method for imbalanced data classification. In these experiments, our method achieved the highest G-mean on 33 datasets and the highest F-value on 32 datasets. These results indicate that the proposed method is more effective than the compared approaches for imbalanced data classification.
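The two kernel-space ideas in the abstract (neighbors by kernel distance, and kernel values for new samples that are never written out explicitly) can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the `gamma` value, the toy data, the helper names, and the simple linear interpolation between two feature-space points are all assumptions made for illustration, and the paper's "concave"/"convex" rules are not reproduced. The sketch relies only on the standard identity ||phi(x) - phi(y)||^2 = k(x, x) - 2 k(x, y) + k(y, y).

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def kernel_distance(x, y, kernel=rbf_kernel):
    """Distance between phi(x) and phi(y), computed from kernel values only."""
    sq = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding errors

def nearest_neighbors(i, X, k, kernel=rbf_kernel):
    """Indices of the k samples closest to X[i] in kernel space."""
    others = [j for j in range(len(X)) if j != i]
    others.sort(key=lambda j: kernel_distance(X[i], X[j], kernel))
    return others[:k]

def interpolated_kernel_value(a, b, lam, x, kernel=rbf_kernel):
    """k(z, x) for the feature-space point z = (1 - lam) * phi(a) + lam * phi(b).

    By linearity of the inner product, z never has to be formed explicitly.
    Linear interpolation is an assumption here, not the paper's exact rule.
    """
    return (1.0 - lam) * kernel(a, x) + lam * kernel(b, x)

# Hypothetical toy data: label 1 is the minority (positive) class.
X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (0.3, 0.4)]
y = [1, 1, 0, 0, 0, 0]

# Treat X[0] as a stand-in for an SV+ sample and inspect its neighborhood.
neighbors = nearest_neighbors(0, X, k=3)
positive_count = sum(y[j] for j in neighbors)

# One row of an extended Gram matrix for a synthetic sample between X[0] and X[1].
synthetic_row = [interpolated_kernel_value(X[0], X[1], 0.5, x) for x in X]
```

Here `positive_count` is the kind of neighborhood class count that a labeling rule could be built on, and `synthetic_row` shows how entries for a new sample can be appended to a Gram matrix using existing kernel values alone.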