Computer science
Transformer
Encoder
Segmentation
Robustness
Artificial intelligence
Algorithm
Pattern recognition
Computer engineering
Engineering
Authors
Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, José M. Alvarez, Ping Luo
Source
Venue: arXiv (Cornell University)
Date: 2021-05-31
Citations: 3065
Identifier
DOI: 10.48550/arXiv.2105.15203
Abstract
We present SegFormer, a simple, efficient yet powerful semantic segmentation framework which unifies Transformers with lightweight multilayer perceptron (MLP) decoders. SegFormer has two appealing features: 1) SegFormer comprises a novel hierarchically structured Transformer encoder which outputs multiscale features. It does not need positional encoding, thereby avoiding the interpolation of positional codes, which degrades performance when the testing resolution differs from the training resolution. 2) SegFormer avoids complex decoders. The proposed MLP decoder aggregates information from different layers, thus combining both local and global attention to render powerful representations. We show that this simple and lightweight design is the key to efficient segmentation with Transformers. We scale our approach up to obtain a series of models from SegFormer-B0 to SegFormer-B5, reaching significantly better performance and efficiency than previous counterparts. For example, SegFormer-B4 achieves 50.3% mIoU on ADE20K with 64M parameters, being 5x smaller and 2.2% better than the previous best method. Our best model, SegFormer-B5, achieves 84.0% mIoU on the Cityscapes validation set and shows excellent zero-shot robustness on Cityscapes-C. Code will be released at: github.com/NVlabs/SegFormer.
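The abstract's second feature, the all-MLP decoder, is concrete enough to sketch: each encoder stage's features are linearly projected to a shared width, upsampled to a common resolution, concatenated, and fused by further linear layers. Below is a minimal PyTorch sketch of that idea; the channel widths, embedding dimension, and class count are illustrative assumptions, not the released NVlabs/SegFormer implementation.

```python
# Minimal sketch (assumption: PyTorch) of the all-MLP decoder idea from the
# abstract. Stage channel widths and the embedding dimension are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AllMLPDecoder(nn.Module):
    def __init__(self, in_channels=(32, 64, 160, 256), embed_dim=256, num_classes=19):
        super().__init__()
        # One linear layer per encoder stage, mapping C_i -> embed_dim.
        self.proj = nn.ModuleList(nn.Linear(c, embed_dim) for c in in_channels)
        # Fuse the concatenated multiscale features, then predict classes.
        self.fuse = nn.Linear(embed_dim * len(in_channels), embed_dim)
        self.classify = nn.Linear(embed_dim, num_classes)

    def forward(self, feats):
        # feats: list of (B, C_i, H_i, W_i) maps from a hierarchical encoder,
        # ordered from the highest-resolution stage to the lowest.
        target_hw = feats[0].shape[2:]  # upsample everything to stage-1 size
        projected = []
        for f, proj in zip(feats, self.proj):
            x = proj(f.flatten(2).transpose(1, 2))           # (B, H_i*W_i, D)
            x = x.transpose(1, 2).reshape(f.size(0), -1, *f.shape[2:])
            x = F.interpolate(x, size=target_hw, mode="bilinear", align_corners=False)
            projected.append(x)
        x = torch.cat(projected, dim=1)                      # (B, 4*D, H, W)
        x = self.fuse(x.permute(0, 2, 3, 1))                 # MLP over channels
        return self.classify(x).permute(0, 3, 1, 2)          # (B, classes, H, W)

# Usage with dummy multiscale features at strides 4, 8, 16, 32 of a 512x512 image:
feats = [torch.randn(1, c, s, s) for c, s in zip((32, 64, 160, 256), (128, 64, 32, 16))]
print(AllMLPDecoder()(feats).shape)  # torch.Size([1, 19, 128, 128])
```

Keeping every decoder component a per-pixel linear layer is what the abstract means by avoiding complex decoders: the fusion of multiscale encoder features, not a heavy convolutional head, carries the representational load.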