Bayes Imbalance Impact Index: A Measure of Class Imbalanced Data Set for Classification Problem

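Artificial intelligence, Machine learning, Class (philosophy), Pattern recognition (psychology), Feature selection, Naive Bayes classifier, Support vector machine, Multiclass classification, Measure (data warehouse), One-class classification, Mathematics, Statistical classification
Authors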
Yang Lu,Yiu-ming Cheung,Yuan Yan Tang
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems [Institute of Electrical and Electronics Engineers]
Volume (Issue): 31 (9): 3525-3539 Citations: 13
Identifier
DOI:10.1109/tnnls.2019.2944962
Abstract

Recent studies of imbalanced data classification have shown that the imbalance ratio (IR) is not the only cause of performance loss in a classifier, as other data factors, such as small disjuncts, noise, and overlapping, can also make the problem difficult. The relationship between the IR and other data factors has been demonstrated, but to the best of our knowledge, there is no measurement of the extent to which class imbalance influences the classification performance of imbalanced data. In addition, it is also unknown which data factor serves as the main barrier for classification in a data set. In this article, we focus on the Bayes optimal classifier and examine the influence of class imbalance from a theoretical perspective. We propose an instance measure called the Individual Bayes Imbalance Impact Index (IBI3) and a data measure called the Bayes Imbalance Impact Index (BI3). IBI3 and BI3 reflect the extent of influence using only the imbalance factor, in terms of each minority class sample and the whole data set, respectively. Therefore, IBI3 can be used as an instance complexity measure of imbalance and BI3 as a criterion to demonstrate the degree to which imbalance deteriorates the classification of a data set. We can, therefore, use BI3 to assess whether it is worth using imbalance recovery methods, such as sampling or cost-sensitive methods, to recover the performance loss of a classifier. The experiments show that IBI3 is highly consistent with the increase of the prediction score obtained by the imbalance recovery methods and that BI3 is highly consistent with the improvement in the F1 score obtained by the imbalance recovery methods on both synthetic and real benchmark data sets.
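
The abstract does not give the formulas for IBI3 or BI3, so the following is only a rough illustration of the underlying idea, not the authors' definition. It assumes two one-dimensional Gaussian classes with known parameters (hypothetical values chosen for the example) and compares the Bayes posterior of the minority class under an imbalanced prior with the posterior the same point would receive if the classes were balanced. The function names `minority_posterior` and `posterior_gap` are made up for this sketch.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def minority_posterior(x, imbalance_ratio, minority=(1.0, 1.0), majority=(-1.0, 1.0)):
    """P(minority | x) from Bayes' rule.

    imbalance_ratio is n_majority / n_minority, so the minority prior is
    1 / (1 + IR). The (mean, std) pairs are assumed, illustrative values.
    """
    prior_min = 1.0 / (1.0 + imbalance_ratio)
    prior_maj = 1.0 - prior_min
    joint_min = prior_min * gaussian_pdf(x, *minority)
    joint_maj = prior_maj * gaussian_pdf(x, *majority)
    return joint_min / (joint_min + joint_maj)


def posterior_gap(x, imbalance_ratio):
    """Balanced-prior posterior minus imbalanced-prior posterior at x.

    A hypothetical stand-in for "how much the imbalance alone hurts this
    point"; it is NOT the IBI3 formula from the paper.
    """
    return minority_posterior(x, 1.0) - minority_posterior(x, imbalance_ratio)


for x in (0.0, 0.5, 2.0):
    print(x, round(posterior_gap(x, imbalance_ratio=9.0), 4))
```

Under these assumptions, the gap is zero when IR = 1 and grows near the class boundary as IR increases, which matches the abstract's point that imbalance affects individual minority samples to different degrees.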