Journal: IEEE Internet of Things Journal [Institute of Electrical and Electronics Engineers]  Date: 2023-08-18  Volume/Issue: 10(23): 20703-20717  Cited by: 4
Identifier
DOI: 10.1109/jiot.2023.3306435
Abstract
Three-dimensional (3-D) understanding and inference have received increasing attention, and 3-D convolutional neural networks (3D-CNNs) have demonstrated superior performance over 2D-CNNs because they learn features from all three dimensions. However, 3D-CNNs suffer from intensive computation and data movement. In this article, Sagitta, an energy-efficient, low-latency on-chip 3D-CNN accelerator, is proposed for edge devices. Locality and small-differential-value dropout are leveraged to increase the sparsity of activations. A full-zero-skipping convolutional microarchitecture is proposed to fully exploit the sparsity of both weights and activations, and a hierarchical load-balancing scheme is introduced to increase hardware utilization. A specialized architecture and computation flow further enhance the effectiveness of the proposed techniques. Fabricated in a 55-nm CMOS technology, Sagitta achieves 3.8 TOPS/W for C3D at a latency of 0.1 s and 4.5 TOPS/W for 3D U-Net at a latency of 0.9 s, operating at 100 MHz with a 0.91-V supply voltage. Compared to state-of-the-art 3D-CNN and 2D-CNN accelerators, Sagitta improves energy efficiency by up to 379.6× and 11×, respectively.
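To make the abstract's sparsity ideas concrete, the sketch below illustrates (a) dropping activations whose frame-to-frame change is small, exploiting temporal locality, and (b) counting the multiply-accumulates a full-zero-skipping datapath would actually perform. This is a minimal NumPy sketch under assumed parameters: the threshold value, the toy tensor shapes, and the exact dropout criterion are illustrative assumptions, not details given in the abstract, and it is in no way the Sagitta hardware implementation.

```python
import numpy as np

# Illustrative sketch (assumptions, not the paper's design): activations whose
# frame-to-frame difference is below a threshold are zeroed, raising sparsity so
# a zero-skipping MAC array can skip the corresponding multiplications.

def small_differential_dropout(act: np.ndarray, threshold: float = 0.1) -> np.ndarray:
    """Zero activations whose temporal (frame-to-frame) change is below `threshold`."""
    out = act.copy()
    diff = np.abs(act[1:] - act[:-1])   # differences along the temporal axis
    out[1:][diff < threshold] = 0.0     # drop "small differential" values
    return out

def effective_macs(act_patch: np.ndarray, kernel: np.ndarray) -> int:
    """MACs a full-zero-skipping datapath would perform for one output point:
    only positions where both the activation and the weight are non-zero."""
    return int(np.count_nonzero((act_patch != 0) & (kernel != 0)))

rng = np.random.default_rng(0)
act = rng.random((8, 16, 16)).astype(np.float32)   # toy (frames, height, width) activations
act += 0.9 * np.roll(act, 1, axis=0)               # correlate consecutive frames

sparse_act = small_differential_dropout(act, threshold=0.1)
print("activation sparsity after dropout:",
      1.0 - np.count_nonzero(sparse_act) / sparse_act.size)

kernel = rng.random((3, 3, 3)).astype(np.float32)
kernel[kernel < 0.5] = 0.0                         # toy weight sparsity (e.g., pruned weights)
patch = sparse_act[:3, :3, :3]                     # one 3x3x3 receptive field
print("dense MACs per output point:", kernel.size)
print("MACs after zero-skipping:   ", effective_macs(patch, kernel))
```

The printed counts only show how zero-skipping shrinks the work per output point on a toy example; the real accelerator additionally needs the hierarchical load balancing mentioned in the abstract to keep its parallel MAC units evenly occupied when sparsity is unevenly distributed.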