Acceleration
Computer science
Reduction (mathematics)
Hyperparameter
MNIST database
Artificial intelligence
Deep learning
Overhead (engineering)
Compression ratio
Compression (physics)
Tucker decomposition
Tensor (intrinsic definition)
Pattern recognition (psychology)
Algorithm
Tensor decomposition
Parallel computing
Mathematics
Geometry
Pure mathematics
Composite material
Operating system
Materials science
Engineering
Internal combustion engine
Automotive engineering
Authors
Wei Dai, Jicong Fan, Yiming Miao, Kai Hwang
Identifier
DOI: 10.1109/tnnls.2023.3330542
Abstract
Large neural network models are hard to deploy on lightweight edge devices and demand large network bandwidth. In this article, we propose a novel deep learning (DL) model compression method. Specifically, we present a dual-model training strategy with an iterative and adaptive rank reduction (RR) in tensor decomposition. Our method regularizes the DL models while preserving model accuracy. With adaptive RR, the hyperparameter search space is significantly reduced. We provide a theoretical analysis of the convergence and complexity of the proposed method. Testing our method on LeNet, VGG, ResNet, EfficientNet, and RevCol over the MNIST, CIFAR-10/100, and ImageNet datasets, our method outperforms the baseline compression methods in both model compression and accuracy preservation. The experimental results validate our theoretical findings. For VGG-16 on the CIFAR-10 dataset, our compressed model shows a 0.88% accuracy gain with a 10.41-times storage reduction and a 6.29-times speedup. For ResNet-50 on the ImageNet dataset, our compressed model achieves a 2.36-times storage reduction and a 2.17-times speedup. In federated learning (FL) applications, our scheme reduces the communication overhead by 13.96 times. In summary, our compressed DL method can significantly improve image understanding and pattern recognition processes.
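The abstract names Tucker decomposition with reduced ranks as the compression primitive. As a rough illustration only, not the paper's actual dual-model or adaptive-RR algorithm, the sketch below applies a truncated higher-order SVD to a randomly initialized convolution weight tensor and reports the resulting storage reduction; the layer shape, the rank values, and the helper names (`unfold`, `tucker_hosvd`) are all hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: matricize the tensor along the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker_hosvd(tensor, ranks):
    """Truncated higher-order SVD: one simple way to obtain a Tucker
    decomposition (small core tensor + per-mode factor matrices)."""
    # Factor matrices: leading left singular vectors of each unfolding.
    factors = [np.linalg.svd(unfold(tensor, m), full_matrices=False)[0][:, :r]
               for m, r in enumerate(ranks)]
    # Core tensor: project the original tensor onto every factor.
    core = tensor
    for mode, U in enumerate(factors):
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

# Hypothetical 4-D convolution weight (out_channels, in_channels, kh, kw)
# compressed with reduced channel ranks; the numbers are illustrative only.
weight = np.random.randn(64, 32, 3, 3)
ranks = [16, 8, 3, 3]
core, factors = tucker_hosvd(weight, ranks)

original = weight.size
compressed = core.size + sum(f.size for f in factors)
print(f"compression ratio: {original / compressed:.2f}x")
```

Storing the small core plus the factor matrices in place of the full weight tensor is what yields the kind of storage reduction the abstract reports; the paper's contribution is in choosing and adapting the ranks during training rather than fixing them up front as done here.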