
UniFormer: Unifying Convolution and Self-Attention for Visual Recognition

Authors
Kunchang Li, Yali Wang, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao
Source
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence (Institute of Electrical and Electronics Engineers)
Pages: 1-18 · Cited by: 37
Identifier
DOI: 10.1109/tpami.2023.3282631
Abstract

It is a challenging task to learn discriminative representation from images and videos, due to large local redundancy and complex global dependency in these visual data. Convolutional neural networks (CNNs) and vision transformers (ViTs) have been the two dominant frameworks in the past few years. Though CNNs can efficiently decrease local redundancy by convolution within a small neighborhood, the limited receptive field makes it hard to capture global dependency. Alternatively, ViTs can effectively capture long-range dependency via self-attention, while blind similarity comparisons among all the tokens lead to high redundancy. To resolve these problems, we propose a novel Unified transFormer (UniFormer), which can seamlessly integrate the merits of convolution and self-attention in a concise transformer format. Different from the typical transformer blocks, the relation aggregators in our UniFormer block are equipped with local and global token affinity respectively in shallow and deep layers, allowing it to tackle both redundancy and dependency for efficient and effective representation learning. Finally, we flexibly stack our blocks into a new powerful backbone, and adopt it for various vision tasks from the image to the video domain, from classification to dense prediction. Without any extra training data, our UniFormer achieves 86.3% top-1 accuracy on ImageNet-1K classification. With only ImageNet-1K pre-training, it achieves state-of-the-art performance on a broad range of downstream tasks: 82.9%/84.8% top-1 accuracy on Kinetics-400/600 and 60.9%/71.2% top-1 accuracy on Something-Something V1/V2 video classification, 53.8 box AP and 46.4 mask AP on COCO object detection, 50.8 mIoU on ADE20K semantic segmentation, and 77.4 AP on COCO pose estimation.
Moreover, we build an efficient UniFormer with a concise hourglass design of token shrinking and recovering, which achieves 2-4× higher throughput than recent lightweight models. Code is available at https://github.com/Sense-X/UniFormer .
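The abstract's central idea is that one block template can host two kinds of relation aggregator: a local token affinity (convolution-like, for shallow layers, to cut local redundancy) and a global token affinity (self-attention, for deep layers, to capture long-range dependency). The following is a minimal NumPy sketch of that contrast over a 1-D token sequence; the function names and uniform local weights are illustrative assumptions, not the authors' implementation, which uses learned depthwise 3D convolutions and multi-head attention (see the linked repository).

```python
import numpy as np

def local_aggregate(tokens, k=3):
    # Local relation aggregator (sketch): each token mixes only with a
    # fixed k-sized neighborhood, depthwise-conv style. Uniform weights
    # stand in for a learned kernel; zero-padding handles the borders.
    n, d = tokens.shape
    pad = k // 2
    padded = np.pad(tokens, ((pad, pad), (0, 0)))
    w = np.ones(k) / k  # hypothetical kernel, shared across channels
    out = np.zeros_like(tokens)
    for i in range(n):
        out[i] = (padded[i:i + k] * w[:, None]).sum(axis=0)
    return out

def global_aggregate(tokens):
    # Global relation aggregator (sketch): scaled dot-product
    # self-attention, so every token compares against all tokens.
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # rows sum to 1
    return attn @ tokens
```

The design trade-off the paper exploits is visible here: the local aggregator costs O(n·k·d) and never sees distant tokens, while the global one costs O(n²·d) but connects every pair, which is why UniFormer places the cheap local form in shallow, high-resolution stages and reserves attention for deep stages.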