Quantization (signal processing)
Computer science
Memory footprint
Artificial neural network
Pruning
Leverage (statistics)
Artificial intelligence
Deep neural network
Machine learning
Algorithm
Pattern recognition (psychology)
Agronomy
Biology
Operating system
Authors
Luis María López Guerra, Bohan Zhuang, Ian R. Reid, Tom Drummond
Source
Journal: arXiv: Computer Vision and Pattern Recognition
Date: 2020-02-03
Citations: 3
Abstract
Neural network quantization and pruning are two techniques commonly used to reduce the computational complexity and memory footprint of deep neural networks for deployment. However, most existing pruning strategies operate on full-precision models and cannot be directly applied to the discrete parameter distributions that result from quantization. In contrast, we study a combination of these two techniques to achieve further network compression. In particular, we propose an effective pruning strategy for selecting redundant low-precision filters. Furthermore, we leverage Bayesian optimization to efficiently determine the pruning ratio for each layer. We conduct extensive experiments on CIFAR-10 and ImageNet with various architectures and precisions. In particular, for ResNet-18 on ImageNet, we prune 26.12% of the model size with Binarized Neural Network quantization, achieving a top-1 classification accuracy of 47.32% in a model of 2.47 MB, and 59.30% with a 2-bit DoReFa-Net in 4.36 MB.
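The abstract notes that magnitude-style pruning criteria break down on quantized weights, whose values collapse to a few discrete levels (e.g. ±1 under BNN binarization). The sketch below illustrates one plausible redundancy criterion for this discrete setting: binarize each filter and mark as redundant those filters whose sign pattern is closest (in Hamming distance) to another filter's. This is an illustrative assumption for exposition, not the paper's actual selection rule; likewise, the paper tunes per-layer pruning ratios with Bayesian optimization, which is not reproduced here.

```python
import numpy as np

def binarize(w):
    """BNN-style sign binarization: every weight becomes +1 or -1."""
    return np.where(w >= 0, 1.0, -1.0)

def redundant_filter_indices(weights, prune_ratio):
    """Illustrative (assumed) criterion for pruning low-precision filters.

    weights: (num_filters, fan_in) full-precision weights of one layer.
    prune_ratio: fraction of filters to mark as redundant.

    Filters whose binarized pattern nearly duplicates another filter's
    contribute little once quantized, so we rank filters by the Hamming
    distance to their nearest neighbour and prune the closest ones.
    """
    b = binarize(weights.reshape(weights.shape[0], -1))
    num_prune = int(round(prune_ratio * b.shape[0]))
    # For ±1 vectors, Hamming distance = (dim - dot product) / 2.
    ham = (b.shape[1] - b @ b.T) / 2.0
    np.fill_diagonal(ham, np.inf)  # ignore self-distance
    nn_dist = ham.min(axis=1)      # distance to nearest other filter
    # Smallest nearest-neighbour distance = most redundant.
    return np.argsort(nn_dist, kind="stable")[:num_prune]

# Usage: filter 1 is a near-copy of filter 0, so the pair is flagged first.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16))
w[1] = w[0] + 0.01
print(sorted(redundant_filter_indices(w, 0.25)))
```

A per-layer pruning ratio would then be chosen for each layer jointly, trading accuracy against the model-size budget; the paper does this search with Bayesian optimization.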