EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention

Authors
Xinyu Liu,Houwen Peng,Ningxin Zheng,Yuqing Yang,Han Huang,Yixuan Yuan
Identifier
DOI:10.1109/cvpr52729.2023.01386
Abstract

Vision transformers have shown great success due to their high model capabilities. However, their remarkable performance is accompanied by heavy computation costs, which makes them unsuitable for real-time applications. In this paper, we propose a family of high-speed vision transformers named EfficientViT. We find that the speed of existing transformer models is commonly bounded by memory-inefficient operations, especially the tensor reshaping and element-wise functions in MHSA. Therefore, we design a new building block with a sandwich layout, i.e., using a single memory-bound MHSA between efficient FFN layers, which improves memory efficiency while enhancing channel communication. Moreover, we discover that the attention maps share high similarities across heads, leading to computational redundancy. To address this, we present a cascaded group attention module feeding attention heads with different splits of the full feature, which not only saves computation cost but also improves attention diversity. Comprehensive experiments demonstrate EfficientViT outperforms existing efficient models, striking a good trade-off between speed and accuracy. For instance, our EfficientViT-M5 surpasses MobileNetV3-Large by 1.9% in accuracy, while getting 40.4% and 45.2% higher throughput on Nvidia V100 GPU and Intel Xeon CPU, respectively. Compared to the recent efficient model MobileViT-XXS, EfficientViT-M2 achieves 1.8% superior accuracy, while running $5.8\times/3.7\times$ faster on the GPU/CPU, and $7.4\times$ faster when converted to ONNX format. Code and models are available online.
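To make the "cascaded group attention" idea concrete: instead of every head attending over a projection of the full feature, each head receives its own channel split of the input, and the output of each head is added to the next head's input before its attention is computed. The following is a minimal NumPy sketch of that cascade under simplifying assumptions (random untrained projections of equal dimension per head); the paper's actual module additionally uses trained projections, a depthwise convolution on the queries, and a final output projection, none of which are reproduced here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cascaded_group_attention(x, heads, rng):
    """Sketch of cascaded group attention.

    x: (n_tokens, dim) feature matrix.
    Each head attends only over its own dim//heads channel split,
    and the previous head's output is added to the current head's
    input (the cascade), which diversifies the attention maps.
    Projection weights here are random stand-ins for trained ones.
    """
    n, dim = x.shape
    d = dim // heads
    splits = np.split(x, heads, axis=1)     # one channel split per head
    prev = np.zeros((n, d))
    outs = []
    for i in range(heads):
        inp = splits[i] + prev              # cascade: add previous head's output
        Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
        q, k, v = inp @ Wq, inp @ Wk, inp @ Wv
        attn = softmax(q @ k.T / np.sqrt(d))  # (n_tokens, n_tokens) per head
        prev = attn @ v
        outs.append(prev)
    return np.concatenate(outs, axis=1)     # (n_tokens, dim)
```

Because each head projects only a `dim/heads` slice rather than the full feature, the projection cost drops roughly by a factor of `heads`, which is the computation saving the abstract refers to.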