Spatial Pyramid Attention for Deep Convolutional Neural Networks

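Keywords: Computer science; Pooling; Artificial intelligence; Pyramid (geometry); Convolutional neural network; Benchmark (surveying); Margin (machine learning); Overhead (engineering); Computation; Feature (linguistics); Pattern recognition (psychology); Exploit; Deep learning; Feature learning; Machine learning; Algorithm; Philosophy; Physics; Optics; Operating system; Linguistics; Computer security; Geography; Geodesy
Authors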
Xu Ma,Jingda Guo,Andrew Sansom,Mara McGuire,Andrew Kalaani,Qi Chen,Sihai Tang,Qing Yang,Song Fu
Source
Journal: IEEE Transactions on Multimedia [Institute of Electrical and Electronics Engineers]
Volume: 23, pages 3048-3058. Cited by: 34
Identifier
DOI:10.1109/tmm.2021.3068576
Abstract

Attention mechanisms have shown great success in computer vision. However, the commonly used global average pooling in some implementations aggregates a three-dimensional feature map to a one-dimensional attention map, leading to a significant loss of structural information in the attention learning. In this article, we present a novel Spatial Pyramid Attention Network (SPANet), which exploits the structural information and channel relationships for better feature representation. SPANet enhances a base network by adding Spatial Pyramid Attention (SPA) blocks laterally. By rethinking the self-attention mechanism design, we further present three topology structures of attention path connection for our SPANet. They can be flexibly applied to various CNN architectures. SPANet is conceptually simple but practically powerful. It uses both structural regularization and structural information to achieve better learning capability. We have comprehensively evaluated the performance of SPANet on four benchmark datasets for different visual tasks. The experimental results show that SPANet significantly improves the recognition accuracy without adding much computation overhead. Using SPANet, we achieve an improvement of 1.6% top-1 classification accuracy on the ImageNet 2012 benchmark based on ResNet50, and SPANet outperforms SENet and other attention methods. SPANet also improves the object detection performance by a clear margin with negligible additional computation overhead. When applying SPANet to RetinaNet based on the ResNet50 backbone, we improve the performance of the baseline model by 2.3 mAP, and the enhanced model outperforms SENet and GCNet by 1.1 mAP and 1.7 mAP, respectively. The code of SPANet is publicly available at https://github.com/13952522076/SPANet_TMM
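
The contrast the abstract draws between global average pooling and a spatial pyramid can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); it is a plain-Python illustration under stated assumptions. The pyramid levels (1, 2, 4), the two-layer ReLU/sigmoid gating network, and all function names below are hypothetical choices made for the example and are not given in the abstract.

```python
import math
import random


def adaptive_avg_pool(channel, out_size):
    """Average-pool one H x W channel (list of rows) to out_size x out_size, flattened."""
    h, w = len(channel), len(channel[0])
    pooled = []
    for i in range(out_size):
        # Bin edges: floor for the start, ceil for the end, so bins cover the map.
        r0, r1 = (i * h) // out_size, -((-(i + 1) * h) // out_size)
        for j in range(out_size):
            c0, c1 = (j * w) // out_size, -((-(j + 1) * w) // out_size)
            cells = [channel[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            pooled.append(sum(cells) / len(cells))
    return pooled


def spatial_pyramid_descriptor(feature_map, levels=(1, 2, 4)):
    """Concatenate pooled values of every channel at every pyramid level.

    With levels=(1,) this reduces to global average pooling (one value per
    channel); finer levels keep coarse spatial layout. The default levels
    are an assumption made for this sketch.
    """
    descriptor = []
    for size in levels:
        for channel in feature_map:
            descriptor.extend(adaptive_avg_pool(channel, size))
    return descriptor


def dense(vector, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def spa_channel_weights(feature_map, w1, b1, w2, b2, levels=(1, 2, 4)):
    """Map the pyramid descriptor to one gate in (0, 1) per channel (assumed MLP)."""
    descriptor = spatial_pyramid_descriptor(feature_map, levels)
    hidden = [max(0.0, v) for v in dense(descriptor, w1, b1)]          # ReLU
    return [1.0 / (1.0 + math.exp(-v)) for v in dense(hidden, w2, b2)]  # sigmoid


def apply_attention(feature_map, gates):
    """Rescale each channel by its gate; the C x H x W shape is unchanged."""
    return [[[gates[k] * v for v in row] for row in channel]
            for k, channel in enumerate(feature_map)]


if __name__ == "__main__":
    random.seed(0)
    channels, height, width, hidden_units = 2, 4, 4, 3
    fmap = [[[random.random() for _ in range(width)] for _ in range(height)]
            for _ in range(channels)]
    in_dim = channels * (1 + 4 + 16)  # 21 pooled values per channel
    w1 = [[random.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(hidden_units)]
    w2 = [[random.uniform(-0.1, 0.1) for _ in range(hidden_units)] for _ in range(channels)]
    gates = spa_channel_weights(fmap, w1, [0.0] * hidden_units, w2, [0.0] * channels)
    print(gates)
```

In this sketch the descriptor has 21 values per channel instead of the single value that global average pooling would produce, which is one way to retain the structural information the abstract refers to.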