Tags: Segmentation, Computer Science, Padding, Position, Computer Vision, Artificial Intelligence, Encoding, Transformer, Remote Sensing, Geography, Computer Security, Physics, Finance, Quantum Mechanics, Voltage, Economics
Authors
Yi Yan,Jing Zhang,Xinjia Wu,Jiafeng Li,Zhuo Li
Identifier
DOI:10.1080/01431161.2023.2299276
Abstract
Semantic segmentation of remote sensing images (RSIs) is of great significance for obtaining geospatial object information. Transformers achieve promising results, but multi-head self-attention (MSA) is computationally expensive. We propose an efficient semantic segmentation Transformer (ESST) for RSIs that combines zero-padding position encoding with linear space reduction attention (LSRA). First, to capture coarse-to-fine features of RSIs, a zero-padding position encoding is proposed: overlapping patch embedding (OPE) layers and convolutional feed-forward networks (CFFN) are added to improve the local continuity of features. Then, we replace MSA with LSRA in the attention operation to extract multi-level features while reducing the computational cost of the encoder. Finally, we design a lightweight all multi-layer perceptron (all-MLP) head decoder that aggregates the multi-level features into multi-scale features for semantic segmentation. Experimental results demonstrate that our method achieves a favorable trade-off between accuracy and speed for semantic segmentation of RSIs on the Potsdam and Vaihingen datasets.
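The abstract does not give implementation details of LSRA, but spatial-reduction attention schemes of this kind typically average-pool the keys and values to a small fixed grid, so the attention matrix is (tokens × pooled tokens) rather than (tokens × tokens) and cost grows linearly with input size. A minimal single-head NumPy sketch under that assumption (function name, pooling size, and the identity Q/K/V projections are illustrative, not the paper's exact design):

```python
import numpy as np

def linear_sr_attention(x, h, w, pool=7):
    """Single-head spatial-reduction attention (sketch, not the paper's code).

    x: (n, c) token features laid out on an h*w grid (n = h*w).
    Keys/values are average-pooled to a pool*pool grid, so the attention
    matrix has shape (n, pool*pool) instead of the quadratic (n, n).
    """
    n, c = x.shape
    q = x  # identity projections for brevity; real models use learned W_q, W_k, W_v
    # Reshape tokens back to the spatial grid and average-pool to pool x pool.
    grid = x.reshape(h, w, c)
    ph, pw = h // pool, w // pool
    pooled = grid[:ph * pool, :pw * pool].reshape(pool, ph, pool, pw, c).mean(axis=(1, 3))
    kv = pooled.reshape(pool * pool, c)
    # Scaled dot-product attention against the reduced key/value set.
    attn = q @ kv.T / np.sqrt(c)                      # (n, pool*pool)
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)           # row-wise softmax
    return attn @ kv                                  # (n, c)

out = linear_sr_attention(np.random.randn(28 * 28, 16), 28, 28)
print(out.shape)  # (784, 16): same token count, but attention cost is n * 49, not n * n
```

With 784 input tokens, full self-attention would build a 784×784 matrix; the pooled variant builds 784×49, which is the source of the encoder-cost savings the abstract claims.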