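Computer science
Field-programmable gate array
Architecture
Latency (audio)
Computer architecture
Software deployment
Object detection
Low latency (capital markets)
Multi-core
Embedded system
Computer hardware
Artificial intelligence
Parallel computing
Pattern recognition (psychology)
Operating system
Computer network
Telecommunications
Art
Visual arts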
Authors
Yu Zhang, Christos-Savvas Bouganis
Identifier
DOI:10.1007/978-3-030-44534-8_25
Abstract
Object detection is the task of detecting the position of objects in an image or video, as well as their corresponding class. The current state-of-the-art approach that achieves the highest performance (i.e. fps) without a significant penalty in detection accuracy is the YOLO framework, and more specifically its latest version, YOLOv3. When embedded systems are targeted for deployment, YOLOv3-tiny, a lightweight version of YOLOv3, is usually adopted. The presented work is the first to implement a parameterised FPGA-tailored architecture specifically for YOLOv3-tiny. The architecture is optimised for latency-sensitive applications and can be deployed on low-end devices with stringent resource constraints. Experiments demonstrate that, when a low-end FPGA device is targeted, the proposed architecture achieves a 290x improvement in latency compared to the hard-core processor of the device, at the cost of a 2.5 pp reduction in mAP (30.9% vs 33.4%) compared to the original model. The presented work opens the way for low-latency object detection on low-end FPGA devices.