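Computer science
Image segmentation
Artificial intelligence
Tuple
Segmentation
Computer vision
Pattern recognition (psychology)
Remote sensing
Geology
Mathematics
Discrete mathematics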
Authors
Yuanxin Ye, Jinkun Dai, Liang Zhou, Keyi Duan, Ran Tao, Wei Li, Danfeng Hong
Identifier
DOI:10.1109/tgrs.2025.3542868
Abstract
Deep learning models show promising potential in multimodal remote sensing image semantic segmentation (MRSISS). However, limited access to labeled samples for training deep learning networks significantly affects the performance of these models. To address this, self-supervised learning (SSL) methods have attracted significant interest in the remote sensing community. Accordingly, this article proposes a novel multimodal contrastive learning framework based on tuple perturbation, which includes a pretraining stage and a fine-tuning stage. In the pretraining stage, a tuple perturbation-based multimodal contrastive learning network (TMCNet) is designed to better explore shared and different feature representations across modalities, and a tuple perturbation module is introduced to improve the network’s ability to extract multimodal features by generating more complex negative samples. In the fine-tuning stage, we develop a simple and effective multimodal semantic segmentation network (MSSNet), which reduces noise by using complementary information from the various modalities to integrate multimodal features more effectively, resulting in better semantic segmentation performance. Extensive experiments have been carried out on two published multimodal image datasets consisting of optical and synthetic aperture radar (SAR) pairs, and the results show that the proposed framework achieves better semantic segmentation performance than current state-of-the-art methods when labeled samples are limited. The source code is available at https://github.com/yeyuanxin110/TMCNet-MSSNet.
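The abstract does not spell out the loss used by TMCNet. As general background only, the following is a minimal sketch of a cross-modal, InfoNCE-style contrastive loss, in which the optical and SAR features of the same scene form a positive pair and all other pairings in the batch act as negatives. The function names, the temperature value, and the random feature arrays are assumptions made for illustration; the sketch does not reproduce the tuple perturbation module or any code from the authors' repository.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def cross_modal_info_nce(opt_feats, sar_feats, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not the paper's exact loss).

    Row i of opt_feats and row i of sar_feats are treated as a positive
    pair; every other row of sar_feats is a negative for optical row i.
    """
    z_opt = l2_normalize(opt_feats)
    z_sar = l2_normalize(sar_feats)
    logits = z_opt @ z_sar.T / temperature            # shape (N, N)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positives sit on the diagonal of the similarity matrix.
    return float(-np.mean(np.diag(log_prob)))


# Hypothetical features: 8 scenes, 16-dimensional embeddings per modality.
rng = np.random.default_rng(0)
opt = rng.normal(size=(8, 16))
sar_aligned = opt + 0.05 * rng.normal(size=(8, 16))  # same scene, slight noise
sar_mismatched = np.roll(sar_aligned, 1, axis=0)     # every pairing is wrong

aligned_loss = cross_modal_info_nce(opt, sar_aligned)
mismatched_loss = cross_modal_info_nce(opt, sar_mismatched)
print(aligned_loss, mismatched_loss)
```

With correctly paired features the loss is small, and it grows sharply when the pairing is broken, which is the signal a contrastive pretraining stage uses to pull the two modalities of one scene together in feature space.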