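Computer science
Software deployment
Distributed computing
Enhanced Data Rates for GSM Evolution
Edge computing
Pruning
Adaptability
Computer architecture
Artificial intelligence
Software engineering
Ecology
Agronomy
Biology
Identifier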
DOI:10.1002/9781394219230.ch4
Abstract
Chapter 4 delves into various model optimization techniques crucial for deploying AI models on edge devices such as smartphones, smartwatches, and IoT devices. These optimizations are categorized into three phases: predeployment, deployment-time, and postdeployment. Predeployment techniques include model architecture selection, quantization, structured pruning, knowledge distillation, and sparsification, which are applied to the model before production to enhance performance and efficiency. Deployment-time techniques, such as IR conversion, graph optimizations, target-dependent optimizations, dynamic batching, model caching, and model parallelism, are employed to optimize models during deployment and runtime. Postdeployment techniques, including model monitoring, retraining, hardware upgrades, and user feedback loops, ensure continuous performance improvement and adaptability of models in real-world scenarios. Through illustrative examples, this chapter provides a comprehensive understanding of how these optimization strategies can be effectively implemented to meet the constraints and requirements of edge computing.
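As an illustration of one predeployment technique named in the abstract, the following is a minimal sketch of symmetric per-tensor int8 quantization. It is not taken from the chapter: the function names `quantize_int8` and `dequantize`, the choice of a symmetric scheme, and the sample weights are all assumptions made for this example, and production toolchains typically add calibration data and per-channel scales.

```python
# Hypothetical sketch of post-training int8 quantization (not the chapter's code).
# Assumption: symmetric, per-tensor scaling into the range [-127, 127].

def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with a single scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero when every weight is zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 0.3]  # made-up example values
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    for w, code, r in zip(weights, q, restored):
        print(f"{w:+.4f} -> {code:4d} -> {r:+.4f}")
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per weight. That trade is the kind of size and latency saving the abstract attributes to predeployment optimization.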