Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

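Transformer, Computer science, Artificial intelligence, Segmentation, Computation, Pixel, Architecture, Computer vision, Image segmentation, Algorithm, Voltage, Engineering, Electrical engineering, Art, Visual arts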
Authors
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo
Identifier
DOI:10.1109/iccv48922.2021.00986
Abstract

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision. Challenges in adapting Transformer from language to vision arise from differences between the two domains, such as large variations in the scale of visual entities and the high resolution of pixels in images compared to words in text. To address these differences, we propose a hierarchical Transformer whose representation is computed with Shifted windows. The shifted windowing scheme brings greater efficiency by limiting self-attention computation to non-overlapping local windows while also allowing for cross-window connection. This hierarchical architecture has the flexibility to model at various scales and has linear computational complexity with respect to image size. These qualities of Swin Transformer make it compatible with a broad range of vision tasks, including image classification (87.3 top-1 accuracy on ImageNet-1K) and dense prediction tasks such as object detection (58.7 box AP and 51.1 mask AP on COCO test-dev) and semantic segmentation (53.5 mIoU on ADE20K val). Its performance surpasses the previous state-of-the-art by a large margin of +2.7 box AP and +2.6 mask AP on COCO, and +3.2 mIoU on ADE20K, demonstrating the potential of Transformer-based models as vision backbones. The hierarchical design and the shifted window approach also prove beneficial for all-MLP architectures. The code and models are publicly available at https://github.com/microsoft/Swin-Transformer.
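
The windowing idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (that is in the repository linked above). The function names `window_partition`, `cyclic_shift`, and `attention_cost` are chosen here for illustration. The sketch assumes a single (H, W, C) feature map whose height and width are divisible by the window size, and it omits the attention masking that a full shifted-window layer needs after the shift. The cost function is a rough multiply-accumulate count that ignores softmax and multi-head details.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size, window_size, C).
    Assumes H and W are divisible by window_size (an assumption of this sketch).
    """
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    # Bring the two window-grid axes together, then flatten them.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)


def cyclic_shift(x, shift):
    """Roll the feature map up and to the left by `shift` positions.

    Partitioning the shifted map yields windows that straddle the borders of
    the unshifted windows, which is one way to obtain cross-window connections.
    """
    return np.roll(x, shift=(-shift, -shift), axis=(0, 1))


def attention_cost(h, w, C, window_size=None):
    """Approximate multiply-accumulate count for one self-attention layer.

    window_size=None means global attention over all h*w tokens; otherwise
    each token attends only to the window_size**2 tokens in its own window.
    """
    tokens = h * w
    projections = 4 * tokens * C**2          # Q, K, V and output projections
    span = tokens if window_size is None else window_size**2
    return projections + 2 * tokens * span * C  # QK^T and attention-times-V


# Toy 8x8 single-channel map whose values are the flat pixel indices.
x = np.arange(64).reshape(8, 8, 1)
regular = window_partition(x, 4)                  # 4 windows of 4x4
shifted = window_partition(cyclic_shift(x, 2), 4)  # windows offset by 2
```

In this toy example the first shifted window covers rows and columns 2 to 5 of the original map, so it mixes pixels from all four regular windows. With a fixed window size, `attention_cost` grows in proportion to the number of tokens, which matches the linear-complexity claim, whereas the global variant has a term that is quadratic in the number of tokens.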