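Field-programmable gate array
Computer science
Latency (audio)
Convolutional neural network
Computation
Design space exploration
Scheduling (production processes)
Embedded system
Computer architecture
Computer hardware
Artificial intelligence
Algorithm
Operations management
Telecommunications
Economics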
Authors
Petros Toupas, Alexander Montgomerie-Corcoran, Christos-Savvas Bouganis, Dimitrios Tzovaras
Identifier
DOI:10.1109/fccm57271.2023.00024
Abstract
For Human Action Recognition (HAR) tasks, 3D Convolutional Neural Networks (CNNs) have proven highly effective, achieving state-of-the-art results. This study introduces a novel toolflow, based on a streaming architecture, for mapping such models onto FPGAs while considering both the model's inherent characteristics and the features of the targeted FPGA device. The HARFLOW3D toolflow takes as input a 3D CNN in ONNX format and a description of the FPGA characteristics, and generates a design that minimises the latency of the computation. The toolflow comprises several parts: (i) a 3D CNN parser, (ii) a performance and resource model, (iii) a scheduling algorithm for executing 3D models on the generated hardware, (iv) a resource-aware optimisation engine tailored for 3D models, and (v) an automated mapping to synthesizable code for FPGAs. The ability of the toolflow to support a broad range of models and devices is shown through a number of experiments on various 3D CNN and FPGA system pairs. Furthermore, the toolflow has produced high-performing results for 3D CNN models that have not been mapped to FPGAs before, demonstrating the potential of FPGA-based systems in this space. Overall, HARFLOW3D delivers competitive latency compared to a range of state-of-the-art hand-tuned approaches, achieving up to 5× better performance than some existing works. The tool is available at https://github.com/ptoupas/harflow3d.
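To make the idea of a resource-aware optimisation engine driven by a performance and resource model more concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not HARFLOW3D's actual model or API: the `Conv3DLayer` class, the `explore` function, the two parallelism knobs, and the "one DSP per parallel multiply-accumulate" cost are all hypothetical. It searches exhaustively for the channel-parallelism pair that minimises an idealised cycle count for one 3D convolution layer under a DSP budget.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Tuple


@dataclass(frozen=True)
class Conv3DLayer:
    """Hypothetical description of one 3D convolution layer."""
    in_ch: int
    out_ch: int
    out_d: int
    out_h: int
    out_w: int
    k_d: int
    k_h: int
    k_w: int

    def macs(self) -> int:
        # Multiply-accumulates needed to compute every output voxel.
        return (self.in_ch * self.out_ch
                * self.out_d * self.out_h * self.out_w
                * self.k_d * self.k_h * self.k_w)


def divisors(n: int) -> list:
    """All positive divisors of n, in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def explore(layer: Conv3DLayer, dsp_budget: int) -> Optional[Tuple[int, int, int]]:
    """Return (cycles, p_in, p_out) with the fewest cycles, or None if nothing fits.

    Assumptions (illustrative only): one DSP per parallel MAC, and an
    ideal pipeline that completes p_in * p_out MACs every cycle.
    """
    best = None
    for p_in, p_out in product(divisors(layer.in_ch), divisors(layer.out_ch)):
        dsps = p_in * p_out
        if dsps > dsp_budget:
            continue  # design point exceeds the device's DSP budget
        # Exact division: p_in divides in_ch and p_out divides out_ch.
        cycles = layer.macs() // dsps
        if best is None or cycles < best[0]:
            best = (cycles, p_in, p_out)
    return best


layer = Conv3DLayer(in_ch=4, out_ch=8, out_d=2, out_h=2, out_w=2,
                    k_d=3, k_h=3, k_w=3)
result = explore(layer, dsp_budget=8)
print(result)
```

A real toolflow would model many more resources (on-chip memory, LUTs, memory bandwidth) and would search jointly across layers and the execution schedule. The sketch only shows the general shape of a search for the lowest-latency design within a resource limit.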