An Ensemble Learning Approach with Gradient Resampling for Class-Imbalance Problems

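Resampling; Boosting (machine learning); Computer science; Machine learning; Artificial intelligence; Ensemble learning; Sampling (signal processing); Class (philosophy); Sample (material); Set (abstract data type); Filter (signal processing); Algorithm; Chemistry; Chromatography; Computer vision; Programming language
Authors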
Shankun Zhao,Chuang Zhao,Xi Zhang,Nanlin Liu,Hengshu Zhu,Qi Liu,Hui Xiong
Source
Journal: INFORMS Journal on Computing, Volume (Issue): 35 (4), Pages: 747-763, Citations: 2
Identifier
DOI:10.1287/ijoc.2023.1274
Abstract

Imbalanced classification arises in many real-world applications and has been extensively studied. Most existing algorithms alleviate the imbalance either by sampling or by guiding ensemble learners with penalties. Combining ensemble learning with class-level sampling strategies has achieved great progress. However, certain hard examples contribute little to model learning and can even degrade performance. From the perspective of identifying the classification difficulty of samples, one important motivation is to design algorithms that treat different samples appropriately through progressive learning. Unfortunately, how best to configure the sampling and learning strategies under ensemble principles at the sample level remains a research gap. In this paper, we propose a new view at the sample level rather than the class level used in existing studies. We design an ensemble approach in a pipeline with sample-level gradient resampling, namely balanced cascade with filters (BCWF). As a preliminary exploration, we first design a hard-example mining algorithm to explore the gradient distribution of the classification difficulty of samples and to identify the hard examples. Specifically, BCWF uses an under-sampling strategy and a boosting manner to train T predictive classifiers and re-identify hard examples. Moreover, BCWF comes with two types of filters: a hard filter (BCWF_h) and a soft filter (BCWF_s). In each round of boosting, BCWF_h strictly removes a gradient/set of the hardest examples from both classes, whereas BCWF_s removes a larger number of harder and easy examples simultaneously for final balanced-class retention. Consequently, the T trained predictive classifiers can be combined with two ensemble voting strategies: average probability and majority vote.
To evaluate the proposed approach, we conduct extensive experiments on 10 benchmark data sets and apply our algorithms to default-user detection on a real-world peer-to-peer lending data set. The experimental results demonstrate the effectiveness and the managerial implications of our approach compared with 11 competitive algorithms.
History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning.
Funding: This work was supported by the National Natural Science Foundation of China [Grants 72101176, 71722005, and 72241432], the National Key R&D Program of China [Grant 2020YFA0908600], and the Natural Science Foundation of Tianjin City [Grant 18JCJQJC45900].
Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2023.1274) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2021.0104) at (http://dx.doi.org/10.5281/zenodo.6360996).
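The abstract names two building blocks that a small example can make concrete: a hard filter that drops the hardest examples from both classes, and the two voting strategies used to combine the T classifiers. The sketch below is illustrative only and is not the authors' BCWF implementation (their code is in the repository linked above). All function names are hypothetical. The difficulty score `|p - y|` (the magnitude of the log-loss gradient with respect to the logit) is an assumption, since the abstract does not state how difficulty is measured.

```python
def hardness(prob, label):
    # Assumed difficulty score: |p - y|, the magnitude of the log-loss
    # gradient with respect to the logit. The paper's score may differ.
    return abs(prob - label)


def hard_filter(indices, probs, labels, n_drop):
    """Drop the n_drop hardest examples of each class; return kept indices."""
    kept = []
    for cls in (0, 1):
        members = [i for i in indices if labels[i] == cls]
        members.sort(key=lambda i: hardness(probs[i], labels[i]))  # easy first
        kept.extend(members[:max(len(members) - n_drop, 0)])
    return sorted(kept)


def average_probability_vote(all_probs, threshold=0.5):
    """Average the classifiers' positive-class probabilities, then threshold."""
    n_models = len(all_probs)
    return [int(sum(col) / n_models >= threshold) for col in zip(*all_probs)]


def majority_vote(all_probs, threshold=0.5):
    """Threshold each classifier's probability, then take a strict majority."""
    n_models = len(all_probs)
    return [int(2 * sum(p >= threshold for p in col) > n_models)
            for col in zip(*all_probs)]


# Toy data: 3 classifiers (T = 3), each scoring the same 3 samples.
all_probs = [
    [0.9, 0.40, 0.2],
    [0.6, 0.45, 0.3],
    [0.1, 0.90, 0.4],
]
avg_pred = average_probability_vote(all_probs)
maj_pred = majority_vote(all_probs)

# Toy filtering round: 3 negatives, 2 positives, one hardest example dropped
# per class.
labels = [0, 0, 0, 1, 1]
probs = [0.1, 0.8, 0.3, 0.9, 0.2]
kept = hard_filter(range(5), probs, labels, n_drop=1)
print(avg_pred, maj_pred, kept)
```

On the toy data the two strategies disagree on the second sample: one confident classifier lifts the average above the threshold, while two of the three individual votes are negative. In this sketch, a tie under majority vote (possible when T is even) resolves to the negative class, which is a design choice of the example rather than something the abstract specifies.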