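Field-programmable gate array
Computer science
Embedded system
Deep learning
Energy consumption
Variety (cybernetics)
Hardware acceleration
Object detection
Computer hardware
Artificial intelligence
Computer architecture
Pattern recognition (psychology)
Engineering
Electrical engineering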
Authors
Yufan Lu, Xiaojun Zhai, Sangeet Saha, Shoaib Ehsan, Klaus D. McDonald-Maier
Identifier
DOI:10.1109/mcsoc51149.2021.00038
Abstract
Machine learning, and in particular deep learning (DL), has seen strong success in a wide variety of applications, e.g. object detection, image classification and self-driving. However, because of limited hardware resources and power budgets, deploying deep learning algorithms on resource-constrained mobile and embedded systems remains challenging, especially for systems that run multiple DL algorithms for a variety of tasks. In this paper, an adaptive hardware resource management system, implemented on field-programmable gate arrays (FPGAs), is proposed to dynamically manage the on-chip hardware resources (e.g. LUTs, BRAMs and DSPs) to adapt to a variety of tasks. Using dynamic function exchange (DFX) technology, the system can dynamically allocate hardware resources to deploy deep learning processor units (DPUs) so as to balance the requirements, performance and power consumption of the deep learning applications. The prototype is implemented on Xilinx Zynq UltraScale+ series chips. The experimental results indicate that the proposed scheme significantly improves the computing efficiency of resource-constrained systems under various experimental scenarios. Relative to the baseline, the proposed strategy consumes 38% and 82% of the power in low-workload and high-workload cases, respectively. Typically, the proposed system can save approximately 75.8% of energy.
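The abstract does not describe the allocation policy itself, so the following is only a minimal sketch of one way the trade-off between requirements, performance and power might be expressed: pick the lowest-power DPU configuration that still meets a task's throughput requirement. The names `DpuConfig` and `pick_config`, and every throughput and power figure, are hypothetical placeholders and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DpuConfig:
    """A hypothetical DPU configuration that could be loaded via DFX."""
    name: str
    throughput_fps: float  # illustrative value, not a measurement
    power_w: float         # illustrative value, not a measurement


def pick_config(configs: Sequence[DpuConfig],
                required_fps: float) -> Optional[DpuConfig]:
    """Return the lowest-power configuration meeting the throughput need.

    Returns None when no configuration is fast enough.
    """
    feasible = [c for c in configs if c.throughput_fps >= required_fps]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.power_w)


# Placeholder configurations; the numbers are made up for illustration.
CONFIGS = [
    DpuConfig("small", throughput_fps=10.0, power_w=2.0),
    DpuConfig("medium", throughput_fps=30.0, power_w=4.0),
    DpuConfig("large", throughput_fps=60.0, power_w=7.0),
]
```

Under these assumed figures, a light workload would map to the smallest configuration and a heavier one to a larger configuration, which is the kind of workload-dependent adaptation the abstract reports.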