Post-training Quantization for Neural Networks with Provable Guarantees

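Quantization (signal processing), Computer science, Artificial neural network, Convolutional neural network, Binary number, Algorithm, Computation, Deep neural network, Mean squared error, Artificial intelligence, Mathematics, Arithmetic, Statistics
Authors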
Jinjie Zhang, Yixuan Zhou, Rayan Saab
Source
Journal: SIAM Journal on Mathematics of Data Science [Society for Industrial and Applied Mathematics]
Volume (Issue): 5 (2): 373-399 Citations: 4
Identifier
DOI:10.1137/22m1511709
Abstract

While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized (e.g., 4-bit, or binary) counterparts, massive savings in computation cost, memory, and power consumption are attained. To that end, we generalize a post-training neural network quantization method, GPFQ, that is based on a greedy path-following mechanism. Among other things, we propose modifications to promote sparsity of the weights, and rigorously analyze the associated error. Additionally, our error analysis expands the results of previous work on GPFQ to handle general quantization alphabets, showing that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights, i.e., level of overparametrization. Our result holds across a range of input distributions and for both fully connected and convolutional architectures, thereby also extending previous results. To empirically evaluate the method, we quantize several common architectures with few bits per weight, and test them on ImageNet, showing only minor loss of accuracy compared to unquantized models. We also demonstrate that standard modifications, such as bias correction and mixed precision quantization, further improve accuracy.
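
The abstract names a greedy path-following mechanism but does not state the update rule. The sketch below is one common way such a scheme is described for a single neuron, not the authors' implementation: weights are quantized one at a time, and each quantized value is chosen from a finite alphabet so that the accumulated output error on a batch of inputs stays as small as possible. The function names, the ternary alphabet, and the Gaussian test data are all assumptions made for illustration.

```python
import random


def nearest_in_alphabet(value, alphabet):
    """Round a scalar to the closest element of a finite alphabet."""
    return min(alphabet, key=lambda a: abs(a - value))


def greedy_quantize_neuron(weights, features, alphabet):
    """Quantize one neuron's weights greedily, one weight at a time.

    weights:  list of N floats (the neuron's weights)
    features: list of N columns, each a list of m floats
              (column t holds the t-th input feature over m samples)
    alphabet: finite list of allowed quantized values (assumed, not from the source)
    """
    m = len(features[0])
    residual = [0.0] * m  # running error: sum over t of (w_t - q_t) * x_t
    quantized = []
    for w_t, x_t in zip(weights, features):
        norm_sq = sum(v * v for v in x_t)
        if norm_sq == 0.0:
            # The feature carries no signal, so fall back to plain rounding.
            q_t = nearest_in_alphabet(w_t, alphabet)
        else:
            # The real-valued minimiser of ||residual + (w_t - q) * x_t||
            # is this projection; rounding it picks the best alphabet element.
            target = sum(x * (r + w_t * x) for x, r in zip(x_t, residual)) / norm_sq
            q_t = nearest_in_alphabet(target, alphabet)
        residual = [r + (w_t - q_t) * x for r, x in zip(residual, x_t)]
        quantized.append(q_t)
    return quantized, residual


if __name__ == "__main__":
    # Hypothetical demo: Gaussian inputs and a ternary alphabet.
    rng = random.Random(0)
    n_weights, n_samples = 256, 32
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_weights)]
    q, err = greedy_quantize_neuron(w, X, [-1.0, 0.0, 1.0])
    output = [sum(w[t] * X[t][i] for t in range(n_weights)) for i in range(n_samples)]
    rel_sq_err = sum(e * e for e in err) / sum(o * o for o in output)
    print("relative square error:", rel_sq_err)
```

Running the demo with a larger `n_weights` is a simple way to explore how the relative square error behaves as the number of weights grows, which is the quantity the abstract's error analysis concerns. The sketch omits the sparsity-promoting modifications, bias correction, and mixed precision mentioned in the abstract.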