Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-Level Sparsity via Mixture-of-Experts

Authors
Rishov Sarkar,Hanxue Liang,Zhiwen Fan,Zhangyang Wang,Cong Hao
Identifier
DOI:10.1109/iccad57390.2023.10323651
Abstract

The computer vision community is embracing two promising learning paradigms: the Vision Transformer (ViT) and Multi-task Learning (MTL). ViT models show extraordinary performance over traditional convolutional networks but are commonly recognized as computation-intensive, especially the self-attention with its quadratic complexity. MTL uses one model to infer multiple tasks with better performance by enforcing shared representations among tasks, but a major drawback is that most MTL regimes require activating the entire model even when only one or a few tasks are needed, causing significant wasted computation. M³ViT is the latest multi-task ViT model that introduces mixture-of-experts (MoE), where only a small portion of subnetworks ("experts") are sparsely and dynamically activated based on the current task. M³ViT achieves better accuracy and over 80% computation reduction, paving the way for efficient real-time MTL using ViT. Despite the algorithmic advantages of MTL, ViT, and even M³ViT, many challenges remain for efficient deployment on FPGA. For instance, in general Transformer/ViT models, self-attention is computationally intensive and requires high memory bandwidth. In addition, softmax operations and the GELU activation function are used extensively, and together they can consume more than half of the entire FPGA's LUT resources. In the M³ViT model, the promising MoE mechanism for multi-tasking exposes new memory access overhead and further increases resource usage because of the additional layer types. To address these challenges in both general Transformer/ViT models and the state-of-the-art multi-task M³ViT with MoE, we propose Edge-MoE, the first end-to-end FPGA accelerator for multi-task ViT, with a rich collection of architectural innovations.
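The task-level sparsity described above can be illustrated with a minimal sketch: a task-conditioned gate scores the experts, and only the top-k experts actually run for the current task. The function and variable names here (`task_moe_layer`, `gate_weights`) are hypothetical illustrations of the general MoE routing idea, not the M³ViT implementation.

```python
import numpy as np

def task_moe_layer(x, experts, gate_weights, task_id, k=2):
    """Illustrative task-level MoE routing: only the top-k experts
    selected by a task-conditioned gate are executed; the remaining
    experts contribute no computation."""
    logits = gate_weights[task_id]          # one gate logit per expert
    topk = np.argsort(logits)[-k:]          # indices of the k active experts
    weights = np.exp(logits[topk] - logits[topk].max())
    weights /= weights.sum()                # softmax over active experts only
    # Only the selected experts run; the others stay idle.
    return sum(w * experts[i](x) for w, i in zip(weights, topk))

# Toy usage: 4 linear experts, 3 tasks; task 0 activates 2 of the 4 experts.
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.standard_normal((8, 8)): x @ W for _ in range(4)]
gate_weights = rng.standard_normal((3, 4))  # 3 tasks x 4 experts
x = rng.standard_normal((16, 8))            # 16 patch tokens, embedding dim 8
y = task_moe_layer(x, experts, gate_weights, task_id=0)
print(y.shape)  # (16, 8)
```

With k of N experts active, the expert FLOPs scale roughly with k/N, which is the source of the 80%+ computation reduction reported for M³ViT.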
First, for general Transformer/ViT models, we propose (1) a novel reordering mechanism for self-attention, which reduces the bandwidth requirement from proportional to constant regardless of the target parallelism; (2) a fast single-pass softmax approximation; (3) an accurate, low-cost GELU approximation, which significantly reduces computation latency and resource usage; and (4) a unified, flexible computing unit that can be shared by almost all computational layers to minimize resource usage. Second, for the advanced multi-task M³ViT with MoE, we propose a novel patch reordering method that completely eliminates memory access overhead. Third, we deliver an on-board implementation and measurements on a Xilinx ZCU102 FPGA, with verified functionality and an open-sourced hardware design, achieving 2.24× and 4.90× better energy efficiency compared with a GPU (A6000) and a CPU (Xeon 6226R), respectively. A real-time video demonstration of our accelerated multi-task ViT on an autonomous driving dataset is available on GitHub (https://github.com/sharc-lab/Edge-MoE/raw/main/demo.mp4), together with our FPGA design using High-Level Synthesis, host code, FPGA bitstream, and on-board performance results.
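Two of the listed techniques have well-known software analogues that can be sketched briefly. A "single-pass" softmax can be realized with the standard online-softmax recurrence, which maintains a running maximum and a rescaled running sum in one sweep instead of separate max/sum/normalize passes; and a low-cost GELU can use the common tanh-based approximation in place of the exact erf form. Both are shown here as plausible illustrations of the general ideas, not as the paper's hardware implementations.

```python
import math

def softmax_single_pass(xs):
    """Online softmax: one sweep maintains a running max m and a
    rescaled running sum s, avoiding a separate max-finding pass."""
    m, s = float("-inf"), 0.0
    for x in xs:
        m_new = max(m, x)
        s = s * math.exp(m - m_new) + math.exp(x - m_new)  # rescale old sum
        m = m_new
    return [math.exp(x - m) / s for x in xs]

def gelu_tanh(x):
    """Common tanh-based GELU approximation, cheaper than the exact
    erf-based definition: 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x^3)))."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

probs = softmax_single_pass([1.0, 2.0, 3.0])
print([round(p, 4) for p in probs])  # [0.09, 0.2447, 0.6652]
print(round(gelu_tanh(1.0), 4))     # 0.8412
```

On hardware, the single-pass structure matters because each extra pass over an attention row costs another round of buffer reads; collapsing to one pass halves the memory traffic for the softmax stage.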