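Field-programmable gate array
Computer science
Convolutional neural network
Block (permutation group theory)
Hardware acceleration
Flexibility (engineering)
Lookup table
Computer hardware
Enhanced Data Rates for GSM Evolution
Table (database)
Embedded system
Edge device
Artificial intelligence
Operating system
Cloud computing
Geometry
Statistics
Mathematics
Data mining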
Authors
Xiang Chen, Jindong Li, Yong Zhao
Identifier
DOI:10.1109/icta53157.2021.9661886
Abstract
Convolutional neural networks (CNNs) have become increasingly popular in machine vision tasks, including image classification and object detection. As one of the most promising platforms for accelerating CNNs, the field-programmable gate array (FPGA) can achieve a better trade-off among speed, flexibility, cost, and power consumption, and is thus better suited to Edge-AI scenarios, which are generally constrained by cost and power. In this paper, by flexibly using line buffers and tiling, we propose an efficient FPGA-based CNN accelerator design. The implementation results indicate that the peak performance of the proposed accelerator reaches 172.8 GOP/s while occupying only 24K LUTs (look-up tables), 40.5 BRAMs (block RAMs), and 296 DSP48s, which shows the highest hardware resource efficiency and computational density compared with previous works.
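The abstract claims high computational density without giving per-resource figures. As a rough illustration, the Python sketch below divides the quoted peak throughput by each quoted resource count. The function and variable names are hypothetical, "24K" is assumed to mean 24,000 LUTs, and the resulting ratios are simple arithmetic on the abstract's numbers, not values reported by the authors.

```python
# Figures quoted in the abstract (peak values, not sustained throughput).
PEAK_GOPS = 172.8   # peak performance in GOP/s
LUTS_K = 24.0       # LUTs in thousands (assumes "24K" means 24,000)
BRAMS = 40.5        # block RAMs
DSP48S = 296        # DSP48 slices


def density(peak_gops: float, resource_count: float) -> float:
    """Return peak throughput per unit of a hardware resource."""
    if resource_count <= 0:
        raise ValueError("resource_count must be positive")
    return peak_gops / resource_count


# Hypothetical derived metrics; the paper may define density differently.
per_dsp = density(PEAK_GOPS, DSP48S)
per_klut = density(PEAK_GOPS, LUTS_K)
per_bram = density(PEAK_GOPS, BRAMS)

print(f"GOP/s per DSP48: {per_dsp:.3f}")
print(f"GOP/s per kLUT:  {per_klut:.3f}")
print(f"GOP/s per BRAM:  {per_bram:.3f}")
```

Ratios like these are only comparable across designs that use the same device family and numeric precision, which the abstract does not specify.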