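FLOPS
Computer science
Latency (audio)
Parallel computing
Reduction (mathematics)
Throughput
Computation
Code (set theory)
Floating point
Convolution (computer science)
Artificial neural network
Deep neural network
Computer engineering
Algorithm
Artificial intelligence
Operating system
Geometry
Set (abstract data type)
Programming language
Wireless
Mathematics
Telecommunications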
Authors
Jierun Chen, Shiu-hong Kao, Hao He, Weipeng Zhuo, Wen Song, Chul‐Ho Lee, S.-H. Gary Chan
Identifier
DOI:10.1109/cvpr52729.2023.01157
Abstract
To design fast neural networks, many works have been focusing on reducing the number of floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does not necessarily lead to a similar level of reduction in latency. This mainly stems from inefficiently low floating-point operations per second (FLOPS). To achieve faster networks, we revisit popular operators and demonstrate that such low FLOPS is mainly due to frequent memory access of the operators, especially the depthwise convolution. We hence propose a novel partial convolution (PConv) that extracts spatial features more efficiently, by cutting down redundant computation and memory access simultaneously. Building upon our PConv, we further propose FasterNet, a new family of neural networks, which attains substantially higher running speed than others on a wide range of devices, without compromising on accuracy for various vision tasks. For example, on ImageNet-1k, our tiny FasterNet-T0 is 2.8×, 3.3×, and 2.4× faster than MobileViT-XXS on GPU, CPU, and ARM processors, respectively, while being 2.9% more accurate. Our large FasterNet-L achieves impressive 83.5% top-1 accuracy, on par with the emerging Swin-B, while having 36% higher inference throughput on GPU, as well as saving 37% compute time on CPU. Code is available at https://github.com/JierunChen/FasterNet.
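The abstract's central observation can be read as latency = FLOPs / FLOPS: cutting the operation count helps only if the achieved operations per second do not fall with it. The sketch below is illustrative only. The function names, the device throughput figures, the feature-map size, and the choice of 16 out of 64 channels are all hypothetical and are not taken from the abstract. It also assumes that a partial convolution applies a regular k x k convolution to a subset c_p of the c channels, which the abstract does not spell out.

```python
def latency_seconds(flops: float, flops_per_second: float) -> float:
    """Latency = FLOPs / FLOPS (operation count over achieved throughput)."""
    return flops / flops_per_second


def regular_conv_flops(h: int, w: int, k: int, c: int) -> int:
    """Multiply-accumulates of a k x k convolution with c input and c output
    channels on an h x w feature map."""
    return h * w * k * k * c * c


def partial_conv_flops(h: int, w: int, k: int, c_p: int) -> int:
    """Same count when the convolution touches only c_p channels
    (assumption: the remaining channels are left untouched)."""
    return h * w * k * k * c_p * c_p


# Hypothetical numbers: 4x fewer FLOPs, but the operator also runs at
# 4x lower FLOPS (for example because it is memory-bound).
baseline = latency_seconds(flops=4.0e9, flops_per_second=2.0e12)
fewer_ops = latency_seconds(flops=1.0e9, flops_per_second=0.5e12)
print(baseline, fewer_ops)  # identical latencies despite 4x fewer FLOPs

# Hypothetical layer: 56 x 56 map, 3 x 3 kernel, 64 channels, 16 convolved.
full = regular_conv_flops(56, 56, 3, 64)
part = partial_conv_flops(56, 56, 3, 16)
print(full // part)  # ratio of operation counts
```

Under these assumed numbers the two networks have the same latency, which is the gap between FLOPs and running speed that the abstract describes.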