Computer science
Semantics (computer science)
Feature (linguistics)
Feature extraction
Segmentation
Artificial intelligence
Image segmentation
Remote sensing
Computer vision
Pattern recognition (psychology)
Semantic features
Remote sensing applications
Image (mathematics)
Semantic mapping
Authors
Wenliang Du,Shuo Tang,Jiaqi Zhao,Rui Yao,Yong Zhou
Identifier
DOI:10.1109/lgrs.2025.3623603
Abstract
The “Others” category in multimodal remote sensing images is characterized by high intra-class variability, so existing lightweight semantic segmentation models struggle with it: they cannot efficiently capture both local details and global dependencies. We propose MoViM, a lightweight model that integrates a hybrid Vision Mamba (ViM) and CNN backbone to capture global contextual information and local details effectively. In addition, MoViM features an Inverted Stem for efficient multimodal fusion, a Global Semantics Extraction (GSE) module for enhanced global feature representation, and a Global-Local Feature Fusion (GLF) module for context-aware feature integration. Extensive experiments on the WHU-OPT-SAR and Potsdam datasets demonstrate that MoViM achieves state-of-the-art performance, particularly in the “Others” category, while maintaining low computational complexity. Our code is available at https://github.com/WenliangDu/MoViM.
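The abstract names a Global-Local Feature Fusion (GLF) module but gives no architectural details. As a loose, hypothetical sketch of the general idea only (gated fusion of a local CNN branch with a global context branch), not the paper's actual design, a minimal NumPy version might look like:

```python
import numpy as np

def global_local_fusion(local_feat, global_feat):
    """Illustrative gated fusion of a local-detail branch and a
    global-context branch. The function name, gating scheme, and
    shapes are assumptions; the paper's GLF module is not specified
    in the abstract."""
    # Global context vector: spatial mean pooling -> shape (C,)
    context = global_feat.mean(axis=(1, 2))
    # Channel-wise sigmoid gate, broadcast over spatial dimensions
    gate = 1.0 / (1.0 + np.exp(-context))[:, None, None]
    # Gate decides, per channel, how much local detail vs. global
    # context contributes to the fused feature map
    return gate * local_feat + (1.0 - gate) * global_feat

rng = np.random.default_rng(0)
local_feat = rng.standard_normal((8, 16, 16))   # (C, H, W) local branch
global_feat = rng.standard_normal((8, 16, 16))  # (C, H, W) global branch
fused = global_local_fusion(local_feat, global_feat)
print(fused.shape)
```

In a real model the gate would be learned (e.g. from a small MLP over the pooled context) rather than a fixed sigmoid of the mean, but the broadcasting pattern above shows how a single per-channel context vector can modulate a full spatial feature map.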