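Computer science
Graph
Segmentation
Convolutional neural network
Fusion
Artificial intelligence
Pattern recognition (psychology)
Computer vision
Theoretical computer science
Linguistics
Philosophy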
Authors
Yanfeng Zhao,Linwei Qiu,Zhenjian Yang,Yadong Chen,Yunjie Zhang
Identifier
DOI:10.1016/j.inffus.2025.103150
Abstract
With the continuous improvement of the spatial resolution of remote sensing images, the application of semantic segmentation to high-resolution remote sensing data has made significant progress. However, because unimodal data provide limited information and different object types in complex scenes are similar or easily confused, the performance of unimodal segmentation models remains limited. To address these challenges, this paper proposes a multimodal semantic segmentation network that combines visible light data (RGB/IRRG) with digital surface model (DSM) data to provide richer object information. We employ different backbone networks to process visible light and DSM data separately and design a Height-Aware Graph Convolution (HAGC) strategy to effectively capture the spatial correlations between the two modalities. Additionally, we introduce a Multimodal Hierarchical Interaction Mamba (MHIMamba) module to fuse and process features from different modalities, achieving feature complementarity and enhancing segmentation performance. Finally, we apply a Progressive Context Cascade Decoder (PCCD) to recover spatial details. This work pioneers the integration of Mamba into multimodal semantic segmentation, facilitating effective cross-modal feature interaction and improving segmentation accuracy in remote sensing imagery. Experimental results demonstrate that our model achieves state-of-the-art segmentation performance on the Potsdam and Vaihingen datasets. Our code is available at https://github.com/zyf-cell/MGF-GCN.
Highlights
• DGCN backbone with Height-Aware Graph Convolution boosts height perception.
• MHIMamba fuses multimodal features in shared space for better performance.
• Progressive Context Cascade Decoder refines multilevel features.
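The abstract does not specify how Height-Aware Graph Convolution is computed, so the following is only an illustrative sketch of the general idea of letting height (DSM) values shape how visible-light features are aggregated over a graph. It is not the authors' HAGC. The Gaussian height kernel, the function names `height_aware_adjacency` and `graph_conv`, the `sigma` parameter, and the toy data are all assumptions made for illustration; the repository linked above is the reference for the actual method.

```python
import numpy as np

def height_aware_adjacency(heights, sigma=1.0):
    """Build a row-normalised adjacency from pairwise height differences.

    heights: (N,) array with one height per graph node (e.g. a DSM value).
    Assumption: nodes at similar height get larger edge weights via a
    Gaussian kernel. This is a stand-in, not the paper's formulation.
    """
    diff = heights[:, None] - heights[None, :]           # (N, N) pairwise differences
    weights = np.exp(-(diff ** 2) / (2.0 * sigma ** 2))  # diagonal is 1 (self-loops)
    return weights / weights.sum(axis=1, keepdims=True)  # each row sums to 1

def graph_conv(features, adjacency, weight):
    """One graph-convolution step: aggregate neighbours, project, apply ReLU."""
    return np.maximum(adjacency @ features @ weight, 0.0)

rng = np.random.default_rng(0)
heights = np.array([0.0, 0.1, 5.0, 5.2])   # two ground-level nodes, two rooftop nodes
features = rng.normal(size=(4, 8))          # hypothetical visible-light node features
weight = rng.normal(size=(8, 3))            # hypothetical projection matrix

adjacency = height_aware_adjacency(heights, sigma=0.5)
out = graph_conv(features, adjacency, weight)
```

In this toy setup the two ground-level nodes exchange information almost exclusively with each other, as do the two rooftop nodes, which is one simple way height can help separate objects that look alike in the visible-light bands.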