Counterfactual Samples Synthesizing for Robust Visual Question Answering

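Question answering, Counterfactual thinking, Computer science, Set (abstract data type), Artificial intelligence, Training set, Natural language processing, Test set, Language model, Focus (optics), Ground truth, Machine learning, Physics, Philosophy, Optics, Programming language, Epistemology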
Authors
Long Chen, Xin Yan, Jun Xiao, Hanwang Zhang, Shiliang Pu, Yueting Zhuang
Identifier
DOI: 10.1109/cvpr42600.2020.01081
Abstract

Although Visual Question Answering (VQA) has made impressive progress over the last few years, today's VQA models tend to capture superficial linguistic correlations in the training set and fail to generalize to test sets with different QA distributions. To reduce these language biases, several recent works introduce an auxiliary question-only model to regularize the training of the targeted VQA model, and achieve dominating performance on VQA-CP. However, due to the complexity of their design, current methods are unable to equip the ensemble-based models with two indispensable characteristics of an ideal VQA model: 1) visual-explainable: the model should rely on the right visual regions when making decisions; 2) question-sensitive: the model should be sensitive to linguistic variations in the question. To this end, we propose a model-agnostic Counterfactual Samples Synthesizing (CSS) training scheme. CSS generates numerous counterfactual training samples by masking critical objects in images or critical words in questions, and assigning different ground-truth answers. After training with the complementary samples (i.e., the original and generated samples), the VQA models are forced to focus on all critical objects and words, which significantly improves both their visual-explainable and question-sensitive abilities. In return, the performance of these models is further boosted. Extensive ablations show the effectiveness of CSS. In particular, by building on top of the LMH model, we achieve a record-breaking performance of 58.95% on VQA-CP v2, a gain of 6.5%.
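The question-side masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-word importance scores are assumed to be given, the `[MASK]` placeholder and the function names `top_k_indices` and `synthesize_counterfactual` are hypothetical, and the answer assignment used here (zeroing the original ground-truth scores) is only one simple way to assign "different ground-truth answers".

```python
from typing import Dict, List, Sequence, Tuple

MASK_TOKEN = "[MASK]"  # hypothetical placeholder; the abstract does not name one


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the positions of the k largest scores, in ascending position order."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(order[:k])


def synthesize_counterfactual(
    tokens: Sequence[str],
    importance: Sequence[float],
    answers: Dict[str, float],
    k: int = 1,
) -> Tuple[List[str], Dict[str, float]]:
    """Mask the k most important words and build a counterfactual answer target.

    Assumption: `importance` holds one precomputed score per token.
    Assumption: the counterfactual target simply zeroes the original
    ground-truth answer scores, since the critical evidence was removed.
    """
    if len(tokens) != len(importance):
        raise ValueError("tokens and importance must have the same length")
    critical = set(top_k_indices(importance, k))
    masked = [MASK_TOKEN if i in critical else tok for i, tok in enumerate(tokens)]
    cf_answers = {ans: 0.0 for ans in answers}
    return masked, cf_answers


# Toy example with made-up importance scores.
question = ["what", "color", "is", "the", "banana"]
importance = [0.1, 0.7, 0.05, 0.05, 0.9]
answers = {"yellow": 1.0}

cf_question, cf_answers = synthesize_counterfactual(question, importance, answers, k=1)
print(cf_question)
print(cf_answers)
```

Training would then use both the original pair (`question`, `answers`) and the synthesized pair (`cf_question`, `cf_answers`) as complementary samples. The same pattern would apply to image-side masking, with object features in place of word tokens.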
