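Computer science
Encoder
Segmentation
Artificial intelligence
Image segmentation
Transformer
Pattern recognition (psychology)
Data mining
Remote sensing
Computer vision
Quantum mechanics
Operating system
Physics
Geology
Voltage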
Authors
Xiao-Qing Zhu, Taile Peng, Xiaobin Hu, Jia Guo, Taotao Cao, Hao Wang
Identifier
DOI:10.1117/1.jrs.17.026512
Abstract
Accurately identifying the semantic information of complex objects is a challenging problem in the semantic segmentation of remote sensing images. We propose a bi-encoder network for semantic segmentation of complex targets, called SN-Unetformer. It combines ConvNeXt and Swin Transformer into a bi-encoder and constructs a feature fusion module (FFM) that exploits channel dependence to fully integrate the semantic information from the two encoders. Moreover, an efficient attention mechanism is introduced to model the global–local relationship. To the best of our knowledge, this is the first method to combine two popular networks, ConvNeXt and Swin Transformer, into a dual encoder. SN-Unetformer has been tested on the large-scale Vaihingen and Potsdam datasets, as well as the LoveDA dataset, which pose significant challenges. Compared with current advanced methods for semantic segmentation of remote sensing images, our method is significantly more accurate. In particular, it achieves a mean intersection over union (mIoU) of 84.3% on the Vaihingen dataset, which is the best result currently available for this dataset.
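For readers unfamiliar with the metric quoted in the abstract, mean intersection over union averages the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch of the standard definition, not the authors' evaluation code. The function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; benchmark protocols may differ, for example in which classes are included in the average.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over two flat sequences of integer class labels.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six "pixels", three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is about 0.5. For real segmentation maps, the two-dimensional label arrays would be flattened before being passed in.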