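Remote sensing
Change detection
Computer science
Feature (linguistics)
Key (lock)
Image (mathematics)
Fusion
Artificial intelligence
Noise (video)
Sensor fusion
Deep learning
Feature extraction
Semantic gap
Code (set theory)
Limiting
Image fusion
Semantic feature
Feature detection (computer vision)
Pattern
Computer vision
Semantics (computer science)
Remote sensing application
Contextual image classification
Pixel
Pattern recognition (psychology)
Image processing
Bridge (graph theory)
Feature learning
Image segmentation
Object detection
Modality (human-computer interaction)
Semantic change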
Authors
Yijun Zhou, Yikui Zhai, Zilu Ying, Tingfeng Xian, Wenlve Zhou, Zhiheng Zhou, Xiaolin Tian, Xudong Jia, Hongsheng Zhang, C. L. Philip Chen
Identifier
DOI:10.1109/tgrs.2025.3631882
Abstract
Although deep learning has advanced remote sensing change detection (RSCD), most methods rely solely on the image modality, which limits feature representation, change pattern modeling, and generalization, especially under illumination and noise disturbances. To address this, we propose MMChange, a multimodal RSCD method that combines image and text modalities to improve accuracy and robustness. An Image Feature Refinement (IFR) module is introduced to highlight key regions and suppress environmental noise. To overcome the semantic limitations of image features, we employ a vision-language model (VLM) to generate semantic descriptions of the bi-temporal images. A Textual Difference Enhancement (TDE) module then captures fine-grained semantic shifts, guiding the model toward meaningful changes. To bridge the heterogeneity between modalities, we design an Image-Text Feature Fusion (ITFF) module that enables deep cross-modal integration. Extensive experiments on LEVIR-CD, WHU-CD, and SYSU-CD demonstrate that MMChange consistently surpasses state-of-the-art methods across multiple metrics, validating its effectiveness for multimodal RSCD. Code is available at: https://github.com/yikuizhai/MMChange.
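
The abstract does not describe the internals of the TDE or ITFF modules, so the following is only a generic illustration of the two ideas it names: a difference signal between the text features of the two dates, and a cross-modal fusion step in which image features attend to text features. The function names (`text_difference`, `cross_attention_fuse`), the element-wise difference, and the projection-free attention are all assumptions made for this sketch; they are not taken from the MMChange code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def text_difference(text_t1, text_t2):
    """Hypothetical text-difference signal: element-wise (t2 - t1) per token.

    Assumes both descriptions are embedded as aligned token lists of equal shape.
    """
    return [[b - a for a, b in zip(row1, row2)]
            for row1, row2 in zip(text_t1, text_t2)]


def cross_attention_fuse(image_tokens, text_tokens):
    """Generic scaled dot-product cross-attention with a residual connection.

    Each image token is a query over the text tokens (used as both keys and
    values). Learned projections are omitted to keep the sketch short.
    """
    dim = len(image_tokens[0])
    scale = math.sqrt(dim)
    fused = []
    for query in image_tokens:
        weights = softmax([dot(query, key) / scale for key in text_tokens])
        context = [sum(w * key[i] for w, key in zip(weights, text_tokens))
                   for i in range(dim)]
        # Residual add: keep the image feature and inject the text context.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Toy inputs: two image tokens and two text tokens per date, all of width 2.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
text_t1 = [[0.5, 0.5], [0.0, 1.0]]
text_t2 = [[1.0, 0.5], [0.0, 0.0]]

delta = text_difference(text_t1, text_t2)
fused = cross_attention_fuse(image_tokens, delta)
```

In this toy setup the fused output keeps the shape of the image tokens, so it could feed a downstream change-prediction head unchanged. A real model would add learned query, key, and value projections and operate on tensors rather than Python lists.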