MPQ-YOLO: Ultra low mixed-precision quantization of YOLO for edge devices deployment

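Quantization (signal processing); Computer science; Edge device; Algorithm; Software deployment; Computer engineering; Real-time computing; Artificial intelligence; Cloud computing; Operating system
Authors
Xinyu Liu, Tao Wang, Jiaming Yang, Chenwei Tang, Jiancheng Lv
Source
Journal: Neurocomputing [Elsevier BV]
Volume/Issue: 574: 127210-127210 Citations: 2
Identifier
DOI: 10.1016/j.neucom.2023.127210
Abstract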

You Only Look Once (YOLO), known for its real-time performance and outstanding accuracy, has emerged as a prominent framework for object detection tasks. However, deploying YOLO on resource-constrained edge devices poses challenges due to its substantial memory requirements. In this paper, we propose MPQ-YOLO, an ultra-low mixed-precision quantization framework designed for edge device deployment. The core idea is to integrate 1-bit Backbone quantization and 4-bit Head quantization with dedicated training techniques. Specifically, we analyze the effect of numerical distribution on the performance of binary neural networks (BNNs), and based on this, we design a backbone with only 1-bit convolution. Then, we introduce a trainable scale and Progressive Network Quantization (PNQ) training strategy to bridge the Backbone and Head for end-to-end quantization training. The former is applied to both weights and activations within the 4-bit Head, enabling effective gradient propagation. The latter mitigates oscillation caused by mixed precision training, promoting smoother training and faster model convergence. Extensive experiments on VOC and COCO datasets demonstrate that MPQ-YOLO achieves a good trade-off between model compression and detection performance. Specifically, compared to the full-precision model, MPQ-YOLO achieves compression of up to 16.3× and 14.2× in terms of computational complexity and model size, respectively, while maintaining relatively high detection accuracy, i.e., 74.7% on VOC and 51.5% on COCO. To the best of our knowledge, MPQ-YOLO is the first YOLO framework with dual low mixed-precision quantization. Moreover, compared to the existing layer-wise mixed-precision quantization methods which cause redundant data processing and massive data movement, MPQ-YOLO offers a more hardware-designer-friendly and straightforward solution through efficient resource utilization and reuse.
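
To make the two precisions named in the abstract concrete, the following is a minimal sketch of generic 1-bit and 4-bit quantizers in plain Python. It is not the authors' implementation: the function names `binarize` and `quantize_uniform` are hypothetical, the 1-bit scaling by mean absolute value is a common binary-network convention assumed here, and the 4-bit `scale` is passed in as a fixed number, whereas the paper describes it as trainable. Gradient handling and the PNQ training schedule are not shown.

```python
def binarize(weights):
    """1-bit quantization sketch: keep only the sign of each weight.

    Assumption: the result is rescaled by the mean absolute value (alpha),
    a common choice in binary networks, not necessarily the paper's.
    """
    alpha = sum(abs(w) for w in weights) / len(weights)
    return [alpha if w >= 0 else -alpha for w in weights], alpha


def quantize_uniform(values, scale, bits=4):
    """Symmetric uniform quantization sketch with step size `scale`.

    Each value is mapped to an integer level in [-2^(bits-1), 2^(bits-1) - 1]
    and then mapped back to a real number (so-called fake quantization).
    In a trainable-scale scheme `scale` would be a learned parameter;
    here it is a plain argument.
    """
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    out = []
    for v in values:
        level = round(v / scale)             # nearest integer level
        level = max(qmin, min(qmax, level))  # clamp to the representable range
        out.append(level * scale)
    return out


if __name__ == "__main__":
    print(binarize([0.5, -1.5, 1.0, -1.0]))
    print(quantize_uniform([0.26, -3.0, 0.9], scale=0.25, bits=4))
```

With 4 bits the integer levels run from -8 to 7, so any value beyond `-8 * scale` or `7 * scale` saturates. This clipping is one reason the choice of scale matters for accuracy.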