Computer science
Encoder
Artificial intelligence
Autoencoder
Transformation (genetics)
Computer vision
Maximization
Image fusion
Identification (biology)
Pattern recognition (psychology)
Object (grammar)
Object detection
Modality (human-computer interaction)
Focus (optics)
Infrared
Consistency (knowledge bases)
Fusion
Field (mathematics)
Sensor fusion
Information source (mathematics)
Image (mathematics)
Multispectral image
Fusion rule
Feature extraction
Image retrieval
Image sensor
Multi-source
Cognitive neuroscience of visual object recognition
Artificial neural network
Authors
Xueling Chen, Hua-Xi Chen, Qixiang Gao, Wuxia Zhang, Chen Ding, Wei Sun, Yanning Zhang
Identifier
DOI:10.1109/tgrs.2025.3613126
Abstract
In recent years, infrared and visible image fusion (IVF) has garnered sustained attention in the field of UAV-based remote sensing. IVF fuses the information from the two source images to produce a single informative image, which can benefit object detection and scene perception. Mainstream methods focus on designing the network architectures of fusion models. Due to the lack of ground-truth fused images, most IVF methods are unsupervised and typically resort to simple pixel-wise or feature-wise losses for self-supervision. However, they usually ignore the fact that the complementary features between infrared and visible images are distributed across different objects, i.e., they are inherently instance-wise. To address this problem, we propose an Instance-aware Encoder (InE), which acts as a learning-based loss that constrains the fused image to maximize complementary information in a self-supervised manner. Specifically, we train two InEs to identify modality-specific features within infrared and visible images, respectively. The two InEs are then used to enforce the consistency of modality-unique features between the source images and the fused image. To enhance the instance-awareness of the InE, we integrate the semantic information of the source images into the encoder via a cross-attention mechanism. With these semantically enhanced features, the trained InE constrains the fusion model to effectively integrate the complementary information of different objects. Moreover, we propose a novel modality transformation model that reduces the modality differences between the two source images, facilitating the identification of modality-specific information. Experimental results on public datasets demonstrate the effectiveness of the proposed InE. We also collect a new UAV-based dataset for evaluation and will make it publicly available, along with the code, upon acceptance of the paper.
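The abstract describes a learned loss built from two frozen modality-specific encoders, plus a cross-attention step that injects semantic features into the encoder. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the classes `ModalityEncoder` and `CrossAttention`, the function `ine_consistency_loss`, and all channel sizes and the L1-based consistency term are illustrative assumptions.

```python
# Minimal sketch of the abstract's idea, NOT the authors' released code.
# Two pre-trained, modality-specific encoders (the "InEs") serve as a
# learned loss that pushes the fused image to keep each modality's unique
# features; a toy cross-attention injects semantic features into the encoder.
# All module names, channel sizes, and the L1 consistency term are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityEncoder(nn.Module):
    """Stand-in for a trained Instance-aware Encoder (InE) of one modality."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class CrossAttention(nn.Module):
    """Toy cross-attention: queries come from encoder features, keys/values
    from semantic features of the source image (assumed to share C channels)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, feat: torch.Tensor, semantic: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feat.shape
        q = self.q(feat).flatten(2).transpose(1, 2)      # (B, HW, C)
        k = self.k(semantic).flatten(2)                  # (B, C, HW)
        v = self.v(semantic).flatten(2).transpose(1, 2)  # (B, HW, C)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)   # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return feat + out                                # residual enhancement


def ine_consistency_loss(fused, ir, vis, ine_ir, ine_vis):
    """Learning-based loss: the fused image, viewed through each frozen InE,
    should reproduce that modality's unique features from its source image."""
    with torch.no_grad():  # the InEs provide fixed targets, not trained here
        f_ir, f_vis = ine_ir(ir), ine_vis(vis)
    return F.l1_loss(ine_ir(fused), f_ir) + F.l1_loss(ine_vis(fused), f_vis)


# Usage with random single-channel images and a placeholder average fusion.
ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
fused = (ir + vis) / 2  # in practice, the output of the fusion network
ine_ir, ine_vis = ModalityEncoder(), ModalityEncoder()
print(ine_consistency_loss(fused, ir, vis, ine_ir, ine_vis).item())

# Semantic enhancement of encoder features via cross-attention.
ca = CrossAttention()
feats = torch.rand(1, 64, 32, 32)  # InE features
sem = torch.rand(1, 64, 32, 32)    # semantic features of the source image
enhanced = ca(feats, sem)          # semantically enhanced features
```

In an actual training loop the two InEs would be pre-trained and frozen, so the gradient of the consistency loss flows only into the fusion network that produces `fused`.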