Omni-Dimensional Dynamic Convolution

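Kernel (algebra); Convolutional neural network; Convolution (computer science); Computer science; Inference; Artificial intelligence; Dimension (graph theory); Convolutional code; Deep learning; Feature (linguistics); Pattern recognition (psychology); Algorithm; Mathematics; Artificial neural network; Decoding methods; Discrete mathematics; Linguistics; Philosophy; Pure mathematics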

Authors

Chao Li,Aojun Zhou,Anbang Yao

Source

Journal: Cornell University - arXiv; Date: 2022-01-01; Citations: 4

Links

arxiv.org, datacite.org, doi.org

Identifiers

DOI: 10.48550/arxiv.2209.07947

Abstract

Learning a single static convolutional kernel in each convolutional layer is the common training paradigm of modern Convolutional Neural Networks (CNNs). Instead, recent research in dynamic convolution shows that learning a linear combination of $n$ convolutional kernels weighted with their input-dependent attentions can significantly improve the accuracy of light-weight CNNs, while maintaining efficient inference. However, we observe that existing works endow convolutional kernels with the dynamic property through one dimension (regarding the convolutional kernel number) of the kernel space, but the other three dimensions (regarding the spatial size, the input channel number and the output channel number for each convolutional kernel) are overlooked. Inspired by this, we present Omni-dimensional Dynamic Convolution (ODConv), a more generalized yet elegant dynamic convolution design, to advance this line of research. ODConv leverages a novel multi-dimensional attention mechanism with a parallel strategy to learn complementary attentions for convolutional kernels along all four dimensions of the kernel space at any convolutional layer. As a drop-in replacement for regular convolutions, ODConv can be plugged into many CNN architectures. Extensive experiments on the ImageNet and MS-COCO datasets show that ODConv brings solid accuracy boosts for various prevailing CNN backbones including both light-weight and large ones, e.g., 3.77%~5.71%|1.86%~3.72% absolute top-1 improvements to MobileNetV2|ResNet family on the ImageNet dataset. Intriguingly, thanks to its improved feature learning ability, ODConv with even one single kernel can compete with or outperform existing dynamic convolution counterparts with multiple kernels, substantially reducing extra parameters. Furthermore, ODConv is also superior to other attention modules for modulating the output features or the convolutional weights.
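
The idea of weighting $n$ candidate kernels along all four dimensions of the kernel space can be sketched in a few lines of NumPy. This is an illustration only, not the authors' implementation. The abstract does not say how the attentions are computed, so the following are assumptions made for the sketch: global average pooling of the input, one linear head per dimension, sigmoid for the spatial, input-channel and output-channel attentions, softmax for the kernel-number attention, and sharing the first three attentions across all $n$ kernels. The names `odconv_kernel`, `heads` and `conv2d` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def odconv_kernel(x, kernels, heads):
    """Combine n candidate kernels into one input-dependent kernel.

    x:       input feature map, shape (c_in, H, W)
    kernels: candidate kernels, shape (n, c_out, c_in, k, k)
    heads:   dict of projection matrices (hypothetical), one per dimension
    """
    n, c_out, c_in, k, _ = kernels.shape
    pooled = x.mean(axis=(1, 2))  # assumed squeeze step: global average pooling
    # Four attentions computed in parallel from the same pooled descriptor.
    a_spatial = sigmoid(heads["spatial"] @ pooled).reshape(1, 1, 1, k, k)
    a_in = sigmoid(heads["in"] @ pooled).reshape(1, 1, c_in, 1, 1)
    a_out = sigmoid(heads["out"] @ pooled).reshape(1, c_out, 1, 1, 1)
    a_kernel = softmax(heads["kernel"] @ pooled).reshape(n, 1, 1, 1, 1)
    # Broadcast each attention along its own dimension, then sum over n.
    return (a_kernel * a_out * a_in * a_spatial * kernels).sum(axis=0)


def conv2d(x, w):
    """Naive stride-1, no-padding convolution, kept simple for clarity."""
    c_out, c_in, k, _ = w.shape
    out_h, out_w = x.shape[1] - k + 1, x.shape[2] - k + 1
    y = np.zeros((c_out, out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i:i + k, j:j + k]
            y[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return y


rng = np.random.default_rng(0)
n, c_out, c_in, k = 4, 8, 3, 3
x = rng.standard_normal((c_in, 16, 16))
kernels = rng.standard_normal((n, c_out, c_in, k, k))
heads = {
    "spatial": rng.standard_normal((k * k, c_in)),
    "in": rng.standard_normal((c_in, c_in)),
    "out": rng.standard_normal((c_out, c_in)),
    "kernel": rng.standard_normal((n, c_in)),
}
w = odconv_kernel(x, kernels, heads)  # one kernel, specific to this input
y = conv2d(x, w)
```

Because the combined kernel has the same shape as a regular one, a single ordinary convolution is applied per input, which is consistent with the abstract's description of ODConv as a drop-in replacement. Setting `n = 1` in this sketch leaves the three sigmoid attentions active while the softmax over kernels becomes trivially 1.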
