X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval

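Computer science; Similarity (geometry); Contrast (vision); Focus (optics); Feature (linguistics); Artificial intelligence; Filter (signal processing); Information retrieval; Modal verb; Pattern recognition (psychology); Computer vision; Image (mathematics); Linguistics; Philosophy; Physics; Chemistry; Polymer chemistry; Optics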
Authors
Yiwei Ma, Guohai Xu, Xiaoshuai Sun, Ming Yan, Ji Zhang, Rongrong Ji
Identifier
DOI: 10.1145/3503161.3547910
Abstract

Video-text retrieval has been a crucial and fundamental task in multi-modal research. The development of video-text retrieval has been considerably promoted by large-scale multi-modal contrastive pre-training, which primarily focuses on coarse-grained or fine-grained contrast. However, cross-grained contrast, which is the contrast between coarse-grained representations and fine-grained representations, has rarely been explored in prior research. Compared with fine-grained or coarse-grained contrast, cross-grained contrast calculates the correlation between a coarse-grained feature and each fine-grained feature, and is able to filter out unnecessary fine-grained features under the guidance of the coarse-grained feature during similarity calculation, thus improving retrieval accuracy. To this end, this paper presents a novel multi-grained contrastive model, namely X-CLIP, for video-text retrieval. However, another challenge lies in the similarity aggregation problem, which aims to aggregate fine-grained and cross-grained similarity matrices into an instance-level similarity. To address this challenge, we propose the Attention Over Similarity Matrix (AOSM) module to make the model focus on the contrast between essential frames and words, thus lowering the impact of unnecessary frames and words on retrieval results. With multi-grained contrast and the proposed AOSM module, X-CLIP achieves outstanding performance on five widely used video-text retrieval datasets, including MSR-VTT (49.3 R@1), MSVD (50.4 R@1), LSMDC (26.1 R@1), DiDeMo (47.8 R@1) and ActivityNet (46.2 R@1).
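The aggregation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the attention over a similarity matrix is a softmax-weighted sum of the similarity scores themselves, so that high-similarity frames or words dominate the instance-level score. The function names, the temperature parameter `tau`, and the averaging of the two pooling directions are all hypothetical choices made for illustration.

```python
import math

def softmax(xs, tau=1.0):
    """Numerically stable softmax with temperature tau (tau is an assumed parameter)."""
    scaled = [x / tau for x in xs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def attention_pool(sims, tau=1.0):
    """Softmax-weighted sum of scores: larger similarities get larger weights."""
    weights = softmax(sims, tau)
    return sum(w * s for w, s in zip(weights, sims))

def cross_grained_score(coarse, fine_feats, tau=1.0):
    """Coarse feature (e.g. whole sentence) against each fine feature (e.g. each frame)."""
    sims = [dot(coarse, f) for f in fine_feats]
    return attention_pool(sims, tau)

def fine_grained_score(frames, words, tau=1.0):
    """Frame-word similarity matrix, pooled along both axes (hypothetical scheme)."""
    matrix = [[dot(f, w) for w in words] for f in frames]
    # Pool over words for each frame, then over frames.
    per_frame = [attention_pool(row, tau) for row in matrix]
    video_side = attention_pool(per_frame, tau)
    # Pool over frames for each word, then over words.
    per_word = [attention_pool(list(col), tau) for col in zip(*matrix)]
    text_side = attention_pool(per_word, tau)
    return (video_side + text_side) / 2
```

With a small `tau` the pooled score approaches the maximum similarity, and with a large `tau` it approaches the plain mean, which is one way to read "lowering the impact of unnecessary frames and words".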