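Computer science
Field-programmable gate array
Convolutional neural network
Pareto principle
Benchmark (surveying)
Computer engineering
Inference
Design space exploration
Hardware acceleration
Focus (optics)
Artificial intelligence
Computer architecture
Embedded system
Computer hardware
Mathematical optimization
Optics
Physics
Mathematics
Geography
Geodesy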
Authors
Mohamed S. Abdelfattah,Lukasz Dudziak,Thomas C. P. Chau,Royson Lee,Hyeji Kim,Nicholas D. Lane
Source
Venue: Field Programmable Gate Arrays
Date: 2020-02-23
Citations: 4
Identifier
DOI:10.1145/3373087.3375334
Abstract
Field-programmable gate arrays (FPGAs) have become a popular compute platform for convolutional neural network (CNN) inference; however, the design of a CNN model and its FPGA accelerator has been inherently sequential. A CNN is first prototyped with little or no hardware awareness to attain high accuracy; subsequently, an FPGA accelerator is tuned to that specific CNN to maximize its efficiency. Instead, we formulate a neural architecture search (NAS) optimization problem that contains parameters from both the CNN model and the FPGA accelerator, and we jointly search for the best CNN model-accelerator pair that boosts accuracy and efficiency; we call this Codesign-NAS. In this paper, we focus on defining the Codesign-NAS multiobjective optimization problem, demonstrating its effectiveness, and exploring different ways of navigating the codesign search space. For CIFAR-10 image classification, we enumerate close to 4 billion model-accelerator pairs and find the Pareto frontier within that large search space. Next, we propose accelerator innovations that improve the entire Pareto frontier. Finally, we compare to ResNet on a highly-tuned accelerator and show that, using codesign, we can improve on CIFAR-100 classification accuracy by 1.8% while simultaneously increasing performance/area by 41% in just 1000 GPU-hours of running Codesign-NAS, thus demonstrating that our automated codesign approach is superior to sequential design of a CNN model and accelerator.
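
The abstract describes finding a Pareto frontier over model-accelerator pairs scored on accuracy and efficiency. As a rough illustration of what that means, the sketch below filters a handful of candidates down to the non-dominated ones. It is not the authors' method: the function name pareto_frontier, the tuple layout, and every number in pairs are hypothetical, and both objectives are assumed to be maximized.

```python
from typing import List, Tuple

# Hypothetical layout: (pair name, accuracy, performance per area).
# All values below are invented for illustration, not taken from the paper.
Candidate = Tuple[str, float, float]


def pareto_frontier(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    Both objectives (accuracy, performance per area) are maximized.
    Exact duplicates of a frontier point are dropped.
    """
    # Visit candidates from highest to lowest accuracy; ties are broken
    # by higher efficiency first.
    ordered = sorted(candidates, key=lambda c: (-c[1], -c[2]))
    frontier: List[Candidate] = []
    best_efficiency = float("-inf")
    for cand in ordered:
        # Every earlier candidate has accuracy >= this one, so this one
        # survives only if it is strictly more efficient than all of them.
        if cand[2] > best_efficiency:
            frontier.append(cand)
            best_efficiency = cand[2]
    return frontier


# Usage with made-up model-accelerator pairs.
pairs: List[Candidate] = [
    ("a", 0.90, 10.0),
    ("b", 0.92, 8.0),
    ("c", 0.89, 9.0),
    ("d", 0.94, 5.0),
    ("e", 0.91, 8.0),
]
frontier = pareto_frontier(pairs)
print([name for name, _, _ in frontier])
```

The sort-then-scan approach runs in O(n log n) for two objectives. A search space of billions of pairs, as in the abstract, would likely need a streaming or batched variant; that is outside the scope of this sketch.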