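Computer science
Encoder
Transformer
Convolutional neural network
Artificial intelligence
Upsampling
Computer vision
Pattern recognition (psychology)
Image (mathematics)
Voltage
Engineering
Electrical engineering
Operating system
Authors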
Cui Zhang,Liejun Wang,Shuli Cheng,Yongming Li
Identifier
DOI:10.1109/tgrs.2022.3160007
Abstract
Convolutional neural networks (CNNs) can extract effective semantic features and have therefore been widely used for remote sensing image change detection (CD) in recent years. CNNs have achieved great success in CD, but because of the intrinsic locality of the convolution operation, they cannot capture global information in space-time. The transformer, proposed in recent years, can effectively extract global information, and it has been applied to computer vision (CV) tasks with remarkable success. In this article, we design a pure transformer network with a Siamese U-shaped structure to solve CD problems and name it SwinSUNet. SwinSUNet contains three modules, an encoder, a fusion module, and a decoder, all of which use Swin transformer blocks as basic units. The encoder has a Siamese structure based on the hierarchical Swin transformer, so it can process bitemporal images in parallel and extract their multiscale features; it uses patch merging and Swin transformer blocks to generate effective semantic features. The fusion module is mainly responsible for merging the bitemporal features generated by the encoder. Like the encoder, the decoder is based on the hierarchical Swin transformer. Unlike the encoder, the decoder uses upsampling and merging (UM) blocks and Swin transformer blocks to recover the details of the change information. After the images pass through these three modules in sequence, SwinSUNet outputs the change maps. We conducted extensive experiments on four CD datasets, and in these experiments SwinSUNet achieved better results than other related methods.
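
The data flow described in the abstract (Siamese encoder, fusion, decoder) can be illustrated with a shape-only sketch that tracks feature-map sizes and performs no learning. Everything below is an assumption made for illustration and is not taken from the paper: the 256-pixel input, patch size 4, embedding width 96, four stages, fusion at every scale, and the rule that each UM step doubles resolution and halves channels. All function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureShape:
    height: int
    width: int
    channels: int


def patch_embed(image_size, patch_size=4, embed_dim=96):
    # Assumed defaults: split the image into non-overlapping patches
    # and project each patch to embed_dim channels.
    side = image_size // patch_size
    return FeatureShape(side, side, embed_dim)


def patch_merging(f):
    # Swin-style patch merging: halve the resolution, double the channels.
    return FeatureShape(f.height // 2, f.width // 2, f.channels * 2)


def encoder(image_size, num_stages=4):
    # Hierarchical encoder: one multiscale feature shape per stage.
    stages = [patch_embed(image_size)]
    for _ in range(num_stages - 1):
        stages.append(patch_merging(stages[-1]))
    return stages


def fuse(a, b):
    # Hypothetical fusion: merge the two temporal features and keep the
    # channel width unchanged (the real merge operation is not specified here).
    assert a == b, "Siamese branches must produce matching shapes"
    return a


def upsample_and_merge(f, skip):
    # Hypothetical UM step: double the resolution, halve the channels,
    # then merge with the fused feature of the same scale.
    up = FeatureShape(f.height * 2, f.width * 2, f.channels // 2)
    assert up == skip, "upsampled feature must match the skip feature"
    return up


def swinsunet_shapes(image_size=256):
    # The same encoder (shared weights) is applied to both acquisition dates.
    feats_t1 = encoder(image_size)
    feats_t2 = encoder(image_size)
    fused = [fuse(a, b) for a, b in zip(feats_t1, feats_t2)]
    x = fused[-1]
    decoder_trace = [x]
    for skip in reversed(fused[:-1]):
        x = upsample_and_merge(x, skip)
        decoder_trace.append(x)
    return feats_t1, decoder_trace


if __name__ == "__main__":
    enc, dec = swinsunet_shapes(256)
    for shape in enc + dec[1:]:
        print(shape)
```

Under these assumptions the encoder shrinks a 256 × 256 input from 64 × 64 × 96 down to 8 × 8 × 768, and the decoder mirrors that path back up to 64 × 64 × 96, which is the U-shaped pattern the abstract refers to.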