Keywords
Normalization, Computer science, Artificial intelligence, Convolutional neural network, Pattern recognition, Transformation, Block, Cognitive neuroscience of visual object recognition, Segmentation, Channel, Contextual image classification, Object detection, Computational complexity theory, Feature extraction, Image, Algorithm, Mathematics, Computer network, Biochemistry, Chemistry, Geometry, Sociology, Anthropology, Gene
Authors
Zongxin Yang, Linchao Zhu, Yu Wu, Yi Yang
Identifier
DOI: 10.1109/cvpr42600.2020.01181
Abstract
In this work, we propose a generally applicable transformation unit for visual recognition with deep convolutional neural networks. This transformation explicitly models channel relationships with explainable control variables. These variables determine whether neurons compete or cooperate, and they are jointly optimized with the convolutional weights for more accurate recognition. In Squeeze-and-Excitation (SE) Networks, channel relationships are implicitly learned by fully connected layers, and the SE block is integrated at the block level. We instead introduce a channel normalization layer to reduce the number of parameters and the computational complexity. This lightweight layer incorporates a simple l2 normalization, making our transformation unit applicable at the operator level with little increase in parameters. Extensive experiments demonstrate the effectiveness of our unit with clear margins on many vision tasks, i.e., image classification on ImageNet, object detection and instance segmentation on COCO, and video classification on Kinetics.
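The abstract describes the mechanism only at a high level. Below is a minimal PyTorch-style sketch of such a gated channel unit, assuming a per-channel spatial l2 pooling step, an l2-based channel normalization scaled by sqrt(C), and a tanh gate; the parameter names alpha, gamma, and beta, the eps constant, and the exact gate form are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    """Sketch of a gated channel unit: l2 context embedding, channel normalization, tanh gate."""

    def __init__(self, channels: int, eps: float = 1e-5):
        super().__init__()
        # Explainable per-channel control variables, trained jointly with the conv weights.
        self.alpha = nn.Parameter(torch.ones(1, channels, 1, 1))   # embedding scale
        self.gamma = nn.Parameter(torch.zeros(1, channels, 1, 1))  # gate scale
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))   # gate bias
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W). Gather global context per channel with a spatial l2 norm.
        embedding = self.alpha * x.pow(2).sum(dim=(2, 3), keepdim=True).add(self.eps).sqrt()
        # Lightweight channel normalization: l2-normalize the embedding across the
        # channel dimension, scaled by sqrt(C) so its magnitude does not depend on C.
        c = embedding.size(1)
        norm = embedding * (c ** 0.5) / embedding.pow(2).sum(dim=1, keepdim=True).add(self.eps).sqrt()
        # Gate in [0, 2]: below 1 suppresses a channel (competition), above 1 amplifies it (cooperation).
        gate = 1.0 + torch.tanh(self.gamma * norm + self.beta)
        return x * gate
```

With gamma and beta initialized to zero the gate starts at 1, so the unit is initially an identity and can be wrapped around an existing convolution, e.g. `y = ChannelGate(64)(conv(x))`. Because it adds only three length-C parameter vectors and a few elementwise operations, it can plausibly be attached to individual convolution operators rather than once per residual block, which is the operator-level integration the abstract contrasts with block-level SE.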