Computer science
Encoder
Artificial intelligence
Autoencoder
Transformation (genetics)
Computer vision
Maximization
Image fusion
Identification (biology)
Pattern recognition (psychology)
Object (grammar)
Object detection
Modality (human-computer interaction)
Focus (optics)
Infrared
Consistency (knowledge bases)
Fusion
Field (mathematics)
Sensor fusion
Information source (mathematics)
Image (mathematics)
Multispectral image
Fusion rule
Feature extraction
Image retrieval
Image sensor
Multi-source
Cognitive neuroscience of visual object recognition
Artificial neural network
Authors
Xueling Chen, Hua-Xi Chen, Qixiang Gao, Wuxia Zhang, Chen Ding, Wei Sun, Yanning Zhang
Identifier
DOI:10.1109/tgrs.2025.3613126
Abstract
In recent years, infrared and visible image fusion (IVF) has garnered sustained attention in the field of UAV-based remote sensing. IVF fuses the information from the two source images to produce a single informative image, which can benefit object detection and scene perception. Mainstream methods focus on designing the network architectures of fusion models. Due to the lack of ground-truth fused images, most IVF methods are unsupervised and typically resort to simple pixel-wise or feature-wise losses for self-supervision. However, they usually ignore the fact that the complementary features between infrared and visible images are distributed across different objects, i.e., they are inherently instance-wise. To address this problem, we propose an Instance-aware Encoder (InE), which acts as a learning-based loss that constrains the fused image to maximize complementary information in a self-supervised manner. Specifically, we train two InEs to identify modality-specific features within infrared and visible images, respectively. The two InEs are then used to enforce the consistency of modality-unique features between the source images and the fused image. To enhance the instance-awareness of the InE, we integrate the semantic information of the source images into the encoder via a cross-attention mechanism. With these semantically enhanced features, the trained InE constrains the fusion model to effectively integrate the complementary information of different objects. Moreover, we propose a novel modality transformation model that reduces the modality differences between the two source images, facilitating the identification of modality-specific information. Experimental results on public datasets demonstrate the effectiveness of the proposed InE. We also collect a new UAV-based dataset for evaluation and will make it publicly available, along with the code, upon acceptance of the paper.
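The abstract describes a learned loss built from two frozen modality-specific encoders, plus a cross-attention step that injects semantic features into the encoder. Below is a minimal PyTorch sketch of that idea, not the authors' implementation: the classes `ModalityEncoder` and `CrossAttention`, the function `ine_consistency_loss`, and all channel sizes and the L1-based consistency term are illustrative assumptions.

```python
# Minimal sketch of the abstract's idea, NOT the authors' released code.
# Two pre-trained, modality-specific encoders (the "InEs") serve as a
# learned loss that pushes the fused image to keep each modality's unique
# features; a toy cross-attention injects semantic features into the encoder.
# All module names, channel sizes, and the L1 consistency term are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityEncoder(nn.Module):
    """Stand-in for a trained Instance-aware Encoder (InE) of one modality."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class CrossAttention(nn.Module):
    """Toy cross-attention: queries come from encoder features, keys/values
    from semantic features of the source image (assumed to share C channels)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, feat: torch.Tensor, semantic: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feat.shape
        q = self.q(feat).flatten(2).transpose(1, 2)      # (B, HW, C)
        k = self.k(semantic).flatten(2)                  # (B, C, HW)
        v = self.v(semantic).flatten(2).transpose(1, 2)  # (B, HW, C)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)   # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return feat + out                                # residual enhancement


def ine_consistency_loss(fused, ir, vis, ine_ir, ine_vis):
    """Learning-based loss: the fused image, viewed through each frozen InE,
    should reproduce that modality's unique features from its source image."""
    with torch.no_grad():  # the InEs provide fixed targets, not trained here
        f_ir, f_vis = ine_ir(ir), ine_vis(vis)
    return F.l1_loss(ine_ir(fused), f_ir) + F.l1_loss(ine_vis(fused), f_vis)


# Usage with random single-channel images and a placeholder average fusion.
ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
fused = (ir + vis) / 2  # in practice, the output of the fusion network
ine_ir, ine_vis = ModalityEncoder(), ModalityEncoder()
print(ine_consistency_loss(fused, ir, vis, ine_ir, ine_vis).item())

# Semantic enhancement of encoder features via cross-attention.
ca = CrossAttention()
feats = torch.rand(1, 64, 32, 32)  # InE features
sem = torch.rand(1, 64, 32, 32)    # semantic features of the source image
enhanced = ca(feats, sem)          # semantically enhanced features
```

In an actual training loop the two InEs would be pre-trained and frozen, so the gradient of the consistency loss flows only into the fusion network that produces `fused`.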