GQNA: Generic Quantized DNN Accelerator With Weight-Repetition-Aware Activation Aggregating

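Keywords: Quantization (signal processing), Computation, Computer science, Algorithm, Inference, Kernel (algebra), Mathematics, Artificial intelligence, Discrete mathematics
Authors
Jianxun Yang, Fengbin Tu, Yixuan Li, Yiqi Wang, Leibo Liu, Shaojun Wei, Shouyi Yin
Source
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers [Institute of Electrical and Electronics Engineers]
Volume (Issue): 69 (10): 4069-4082 Citations: 10
Identifier
DOI:10.1109/tcsi.2022.3188899
Abstract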

Quantization is a prominent approach to compressing deep neural networks (DNNs): it clusters high-precision weights into a smaller set of quantization levels and represents each weight by a low-precision index. To achieve the same accuracy, nonuniform quantized DNNs (NUQ-DNNs), which have unequal quantization intervals, need lower index precision than uniform quantized DNNs (UQ-DNNs), which have equal intervals, and therefore achieve smaller model sizes. Hence, deploying NUQ-DNNs on accelerators requires fewer on- and off-chip memory accesses than deploying UQ-DNNs, which is especially valuable for edge devices. However, accelerating NUQ-DNNs is nontrivial, since weight indexes cannot be used directly in computations. Previous NUQ-DNN accelerators adopt standard convolutions, decoding weight indexes into actual weights that are then multiplied with activations, which causes substantial look-up overhead and redundant computations. In this work, we propose a weight-repetition-aware activation aggregating (WPAA) convolution approach to accelerate inference of variable-precision NUQ- and UQ-DNNs. By merging the convolutions of multiple kernels, WPAA requires no look-up operations and removes redundant computations. Based on WPAA, we design a generic quantized DNN accelerator (GQNA). Furthermore, we propose a layer-adaptive kernel-reordering merging scheme that adjusts the merging order of kernels off-line to minimize the energy consumption of GQNA. Implemented in TSMC 28-nm technology, GQNA achieves energy efficiencies of 31.9 and 32.6 TOPS/W for 1-b UQ- and NUQ-VGG-16, respectively.
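The general idea of exploiting weight repetition can be illustrated with a small sketch. This is not the paper's WPAA dataflow (the abstract does not give its details, and WPAA merges multiple kernels and avoids look-ups entirely); it only shows, for a single dot product, why summing activations that share a weight index before multiplying gives the same result with fewer multiplications. The codebook values, index list, and function names are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def dot_decode_then_multiply(indexes, activations, codebook):
    """Baseline: decode every index to its weight, then multiply-accumulate."""
    return sum(codebook[i] * a for i, a in zip(indexes, activations))


def dot_aggregate_then_multiply(indexes, activations, codebook):
    """Sum activations sharing a weight index, then multiply once per level."""
    sums = defaultdict(float)
    for i, a in zip(indexes, activations):
        sums[i] += a  # additions only; no weight value is needed yet
    return sum(codebook[i] * s for i, s in sums.items())


# Hypothetical nonuniform codebook: four levels addressed by 2-bit indexes.
codebook = [-0.75, -0.125, 0.25, 1.5]
indexes = [0, 3, 3, 1, 2, 0, 3, 2]
activations = [1.0, 2.0, 0.5, 4.0, 3.0, 2.0, 1.5, 1.0]

baseline = dot_decode_then_multiply(indexes, activations, codebook)
aggregated = dot_aggregate_then_multiply(indexes, activations, codebook)

# Eight multiplications in the baseline versus one per distinct level here.
multiplications_saved = len(indexes) - len(set(indexes))
print(baseline, aggregated, multiplications_saved)
```

In this toy case the saving grows as index precision drops, because fewer distinct levels mean more repetition among the weights of a kernel.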
