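Remote sensing
Computer science
Focus (optics)
Feature (linguistics)
Closed captioning
Field (mathematics)
Inference
Coding (set theory)
Projection (relational algebra)
Feature extraction
Change detection
Artificial intelligence
Image (mathematics)
Decoding
Remote sensing application
Sensitivity (control systems)
Pixel
Image editing
Word (group theory)
Pattern recognition (psychology)
Benchmark (surveying)
Differential (mechanical device)
Image resolution
Decoding methods
Computer vision
Generator (circuit theory)
Spatial analysis
Radioactivity detection
Extractor
Network architecture
Data mining
Source code
Real-time computing
Encoding (memory)
Feature vector
Identification (biology)
Conditional random field
Channel (broadcasting)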
Authors
Xincheng Gao,G. Chen,Li’ao Quan,Ping Zhao,Shenghui Liu
Identifier
DOI:10.1109/lgrs.2025.3646467
Abstract
Remote sensing image change captioning (RSICC) aims to describe the changes between bi-temporal remote sensing images in natural language and has emerged as a new field of research in recent years. Model lightweighting is crucial for facilitating real-world applications. Current studies primarily focus on enhancing model performance, while relatively little attention has been paid to model lightweighting. To address this problem, we propose a novel lightweight and high-performance framework, SWMCC, which includes: 1) a dual-stream StarNet-based feature extractor for multi-scale feature extraction; 2) a Spatial Relationship Enhancement (SRE) module comprising a multi-scale difference mechanism that improves sensitivity to change regions and a high-dimensional spatial projection module that leverages differential features to boost decoder capabilities; 3) a Mamba-Trans caption generator that efficiently decodes the correspondence between image embeddings and word embeddings. Experiments demonstrate the effectiveness of our proposed modules and the designed Mamba decoder. The SWMCC framework achieves the lowest inference cost and the fastest inference speed on the LEVIR-CC dataset, along with excellent performance compared with state-of-the-art (SOTA) models. The code and pre-trained weights will be available at https://github.com/Fullahea/SWMCC.git.
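
The abstract does not specify how the multi-scale difference mechanism is computed. As a rough illustration of the general idea only, the sketch below builds a small feature pyramid for each of the two acquisition dates and takes the absolute difference at every scale. It is not the SWMCC implementation: the 2x2 average-pooling pyramid stands in for the StarNet backbone, single-channel grids stand in for feature maps, and the names `avg_pool2`, `pyramid`, and `multi_scale_difference` are hypothetical.

```python
from typing import List

Grid = List[List[float]]  # single-channel 2D feature map (assumption for illustration)


def avg_pool2(x: Grid) -> Grid:
    """Downsample by 2x2 average pooling; height and width are assumed even."""
    h, w = len(x), len(x[0])
    return [
        [(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


def pyramid(x: Grid, levels: int) -> List[Grid]:
    """Return the input followed by (levels - 1) successively pooled copies."""
    out = [x]
    for _ in range(levels - 1):
        out.append(avg_pool2(out[-1]))
    return out


def multi_scale_difference(before: Grid, after: Grid, levels: int = 2) -> List[Grid]:
    """Absolute difference between the two dates at each pyramid level."""
    diffs = []
    for fb, fa in zip(pyramid(before, levels), pyramid(after, levels)):
        diffs.append(
            [[abs(a - b) for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(fa, fb)]
        )
    return diffs


# Toy bi-temporal pair: identical 4x4 maps except for one changed cell.
before = [[0.0] * 4 for _ in range(4)]
after = [row[:] for row in before]
after[0][0] = 4.0

diffs = multi_scale_difference(before, after, levels=2)
```

In this toy case the change appears as a single strong response at full resolution and as a weaker, spatially coarser response at the pooled level, which is the kind of complementary signal a multi-scale difference is meant to expose.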