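Computer science
Artificial intelligence
Machine learning
Convolutional neural network
Computation
Mobile device
Resource (disambiguation)
Field (mathematics)
Acceleration
Artificial neural network
Deep learning
Computer engineering
Algorithm
Physics
Classical mechanics
Computer network
Mathematics
Pure mathematics
Operating system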
Authors
Tejalal Choudhary,Vipul Kumar Mishra,Anurag Goswami,S. Jagannathan
Identifier
DOI:10.1007/s10462-020-09816-7
Abstract
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvement in computer vision, natural language processing, stock prediction, forecasting, and audio processing, to name a few. The trained DL models for these complex tasks are large, which makes them difficult to deploy on resource-constrained devices. For instance, the pre-trained VGG16 model trained on the ImageNet dataset is more than 500 MB in size. Resource-constrained devices such as mobile phones and Internet of Things devices have limited memory and computational power. For real-time applications, the trained models should be deployed on these devices. Popular convolutional neural network models have millions of parameters, which increases the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying them on resource-constrained devices, with the least possible compromise in model accuracy. Retaining the same accuracy after compressing the model is a challenging task. To address this challenge, many researchers have suggested different techniques for model compression and acceleration in the last couple of years. In this paper, we present a survey of various techniques suggested for compressing and accelerating ML and DL models. We also discuss the challenges of the existing techniques and provide future research directions in the field.
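The "more than 500 MB" figure for VGG16 can be checked with a back-of-the-envelope count. The sketch below is not from the paper: it assumes the standard VGG16 layout (13 convolutional layers with 3x3 kernels, then three fully connected layers on a 7x7x512 feature map, 1000 output classes) and assumes each parameter is stored as a 32-bit float. The names `CONV_CHANNELS`, `FC_SHAPES`, and `BYTES_PER_PARAM` are illustrative.

```python
# Rough parameter and size estimate for VGG16.
# Assumptions (not stated in the abstract): standard VGG16 layer shapes,
# 3x3 convolutions with bias, weights stored as 32-bit floats.

# (input channels, output channels) for the 13 convolutional layers
CONV_CHANNELS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (input features, output features) for the 3 fully connected layers
FC_SHAPES = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

BYTES_PER_PARAM = 4  # assumption: float32 storage


def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one convolutional layer."""
    return kernel * kernel * c_in * c_out + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


conv_total = sum(conv_params(c_in, c_out) for c_in, c_out in CONV_CHANNELS)
fc_total = sum(fc_params(n_in, n_out) for n_in, n_out in FC_SHAPES)
total_params = conv_total + fc_total
size_mb = total_params * BYTES_PER_PARAM / 1e6  # decimal megabytes

print(f"conv parameters: {conv_total:,}")
print(f"fc parameters:   {fc_total:,}")
print(f"total:           {total_params:,}")
print(f"approx. size:    {size_mb:.1f} MB")
```

Under these assumptions the count comes to roughly 138 million parameters, or about 553 MB, with most of them in the fully connected layers. Storing each parameter in fewer bytes (a smaller `BYTES_PER_PARAM`) or removing parameters shrinks this estimate proportionally.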