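Computer science
Encoder
Segmentation
Transformer
Artificial intelligence
Upsampling
Computer vision
Image segmentation
Remote sensing
Image (mathematics)
Geography
Engineering
Voltage
Electrical engineering
Operating system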
Authors
Lili Fan,Yu Zhou,Hongmei Li,Yunjie Li,Dongpu Cao
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing
[Institute of Electrical and Electronics Engineers]
Date: 2023-01-01
Volume/Issue: 61: 1-11
Identifier
DOI:10.1109/tgrs.2023.3329152
Abstract
Remote sensing semantic segmentation plays a significant role in various applications such as environmental monitoring, land use planning, and disaster response. CNNs have been dominating remote sensing semantic segmentation. However, due to the limitations of convolution operations, CNNs cannot effectively model global context. The success of Transformers in the NLP domain provides a new solution for global context modeling. Inspired by Swin Transformer, we propose a novel remote sensing semantic segmentation model called CSTUNet. This model employs a dual-encoder structure consisting of a CNN-based main encoder and a Swin Transformer-based auxiliary encoder. We first utilize a detail-structure preservation module (DPM) to mitigate the loss of detail and structure information caused by Swin Transformer downsampling. Then we introduce a spatial feature enhancement module (SFE) to collect contextual information from different spatial dimensions. Finally, we construct a position-aware attention fusion module (PAFM) to fuse contextual and local information. Our proposed model obtained 70.75% MIoU on the ISPRS-Vaihingen dataset and 77.27% MIoU on the ISPRS-Potsdam dataset.
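The abstract reports results as MIoU (mean intersection over union). As a reading aid, here is a minimal sketch of the generic metric in plain Python. The function name `mean_iou`, its parameters, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration. The abstract does not state the paper's exact evaluation protocol (for example, which classes are included), so this is not necessarily how the reported numbers were computed.

```python
def mean_iou(pred, target, num_classes):
    """Generic mean IoU over flattened per-pixel class labels (illustrative sketch)."""
    # Confusion matrix: cm[t][p] counts pixels of true class t predicted as class p.
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1

    ious = []
    for c in range(num_classes):
        intersection = cm[c][c]
        true_count = sum(cm[c])                  # pixels whose true class is c
        pred_count = sum(row[c] for row in cm)   # pixels predicted as class c
        union = true_count + pred_count - intersection
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection / union)

    if not ious:
        raise ValueError("no class present in prediction or ground truth")
    return sum(ious) / len(ious)


# Toy example with two classes and four pixels.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. For real label maps, flatten each image to a one-dimensional sequence of integer class indices before calling the function.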