Involution: Inverting the Inherence of Convolution for Visual Recognition
Keywords
Convolutional neural network
Deep learning
Involution
Artificial intelligence
Computer science
Segmentation
Convolution (computer science)
Pattern recognition
Operator
Artificial neural network
Algorithm
Authors
Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen
Source
Venue: arXiv (Cornell University)
Date: 2021-01-01
Cited by: 13
Identifier
DOI:10.48550/arxiv.2103.06255
Abstract
Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic and channel-specific. Instead, we present a novel atomic operation for deep neural networks by inverting the aforementioned design principles of convolution, coined as involution. We additionally demystify the recent popular self-attention operator and subsume it into our involution family as an over-complicated instantiation. The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks, including ImageNet classification, COCO detection and segmentation, together with Cityscapes segmentation. Our involution-based models improve the performance of convolutional baselines using ResNet-50 by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding box AP, and 4.7% mean IoU absolutely while compressing the computational cost to 66%, 65%, 72%, and 57% on the above benchmarks, respectively. Code and pre-trained models for all the tasks are available at https://github.com/d-li14/involution.
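The abstract describes involution as inverting the two properties of standard convolution: the kernel becomes spatial-specific (a distinct kernel is generated for each pixel) and channel-agnostic (one kernel is shared across a group of channels). As a rough illustration only, below is a minimal PyTorch sketch of such a layer; the class name Involution2d and the hyperparameters (kernel_size=7, groups=16, reduction=4) are assumptions made for this sketch rather than details stated in the abstract, and the authors' reference implementation lives at the GitHub link above.

```python
import torch
import torch.nn as nn

class Involution2d(nn.Module):
    """Minimal involution-style layer (sketch): kernels are generated
    per spatial position (spatial-specific) and shared across channel
    groups (channel-agnostic), inverting the two design principles of
    standard convolution named in the abstract."""

    def __init__(self, channels, kernel_size=7, groups=16, reduction=4):
        super().__init__()
        self.k = kernel_size
        self.g = groups
        # Kernel-generation function: a bottleneck of two 1x1 convs that
        # maps each input pixel to its own K*K*G kernel weights.
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.BatchNorm2d(channels // reduction),
            nn.ReLU(inplace=True),
        )
        self.span = nn.Conv2d(channels // reduction, kernel_size ** 2 * groups, 1)
        # Unfold gathers the K x K neighbourhood around every pixel.
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        b, c, h, w = x.shape
        # (B, G, 1, K*K, H, W): one K x K kernel per position per group.
        weight = self.span(self.reduce(x)).view(b, self.g, 1, self.k ** 2, h, w)
        # (B, G, C//G, K*K, H, W): input patches, grouped over channels.
        patches = self.unfold(x).view(b, self.g, c // self.g, self.k ** 2, h, w)
        # Multiply-accumulate over the kernel dimension.
        out = (weight * patches).sum(dim=3)
        return out.view(b, c, h, w)

if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    layer = Involution2d(64)
    print(layer(x).shape)  # torch.Size([2, 64, 32, 32])
```

Because the kernel is computed from the input pixel itself, the operator is content-adaptive, which is what lets the abstract subsume self-attention as an over-complicated instantiation of the same involution family.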