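Computer science
Image segmentation
Artificial intelligence
Architecture
Segmentation
Computer vision
Remote sensing
Image (mathematics)
Pattern recognition (psychology)
Geology
Geography
Archaeology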
Authors
Jinbo Zhang,Min Chen,Yitao Zhao,Lianlei Shan,Changzhi Li,Han Hu,Xuming Ge,Qing Zhu,Bo Xu
Identifier
DOI:10.1109/tgrs.2025.3589552
Abstract
Large-size remote sensing images contain rich geographical information. Efficient and accurate semantic segmentation of these images is of significant importance in various fields. However, the massive memory requirements have hindered the development of semantic segmentation methods for large-size remote sensing images. Most existing methods struggle to balance memory usage, global modeling, and local representation accuracy. To address these issues, we propose a new semantic segmentation method for large-size remote sensing images, Mamba–CNN parallel network (MCPNet), which demonstrates impressive performance. The method is an asymmetric Mamba–convolutional neural network (CNN) hybrid architecture. Given the linear modeling complexity of Mamba, we construct the M-branch based on the visual state space (VSS) model, which processes downsampled images to reduce memory consumption while alleviating Mamba’s local forgetting problem. To further enhance the model’s capability in fine-grained detail extraction, we meticulously design a detail-preserving network (DPN) as the C-branch. This branch employs a split downsampling strategy and multiscale convolutional kernel groups to process large-size images, ensuring the preservation of spatial positional relationships while capturing fine-grained local details. Moreover, to effectively filter redundant information introduced by large-size images and bridge the semantic gap between the features extracted by CNN and Mamba, we propose a multigated feature fusion module (MG-FFM). This module progressively refines heterogeneous feature alignment through a bottom-up hierarchical refinement strategy, achieving a progressive fusion of semantics and details. Our method achieves state-of-the-art (SOTA) performance in terms of mean intersection over union (mIoU) and mF1 score on the self-constructed Yaan UAV dataset and two widely used public datasets (DeepGlobe and Inria Aerial) while consuming less GPU memory. 
The code will be available at https://github.com/fsqy-zhang/MCPNet
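The abstract reports results as mean intersection over union (mIoU) and mF1. As a rough illustration of what these two numbers measure, the sketch below computes both from a pixel-level confusion matrix using the standard per-class definitions. It is not taken from the MCPNet code; the function names, the toy labels, and the choice to average only over classes that actually occur are assumptions made for this example, and the paper's exact evaluation protocol may differ.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_mf1(cm):
    """Return (mIoU, mF1), averaged over classes that occur at least once.

    Assumption for this sketch: classes absent from both the labels and the
    predictions are skipped rather than counted as zero.
    """
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        if tp + fp + fn == 0:
            continue
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example (hypothetical data): six pixels, two classes.
labels = [0, 0, 0, 1, 1, 1]
preds = [0, 0, 1, 1, 1, 0]
cm = confusion_matrix(labels, preds, num_classes=2)
miou, mf1 = miou_and_mf1(cm)
print(f"mIoU = {miou:.3f}, mF1 = {mf1:.3f}")
```

In practice the confusion matrix would be accumulated over every pixel of every test image before the per-class scores are averaged.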