Correlation coefficient
Artificial intelligence
Artificial neural network
Image fusion
Pattern recognition (psychology)
Computer vision
Infrared
Image processing
Computer science
Correlation
Mathematics
Image (mathematics)
Machine learning
Optics
Physics
Geometry
Authors
Xin Guo,Tongwei Lu,Lei Chen
Identifier
DOI:10.1117/1.jei.33.6.063030
Abstract
Infrared-visible image fusion (IVF) proves advantageous in conveying scene information comprehensively. Although convolutional neural network (CNN)-based methods skillfully extract local information, their global information modeling capabilities remain insufficient. Conversely, attention-based methods excel at capturing global information but often neglect detailed features. Existing CNN-transformer methods achieve excellent performance but still fall short in capturing image details. In response to these limitations, we introduce a Swin Transformer integrated with an invertible neural network (INN) and correlation coefficient (CC) assistance for the IVF network, denoted SICFuse. SICFuse incorporates an attention enhancement module to enrich the extracted image information. For global information modeling, we employ Swin blocks, capitalizing on their multi-scale representations and shifted-window attention mechanism, thereby further reducing computational complexity. Recognizing the efficacy of attention-based methods in emphasizing global information, we address the shortfall in local information modeling by introducing INN blocks; these blocks effectively preserve image information, enabling lossless transmission. To enhance fusion results, the CC is incorporated into the loss function. Finally, the image reconstruction module produces a fused image from both global information and local features. The overall approach leverages the strengths of attention mechanisms and invertibility, bridging the gap between local and global information modeling for superior image fusion outcomes. Results on the MSRS, M3FD, and TNO datasets demonstrate promising performance on IVF tasks.
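The abstract's two less-standard ingredients — a correlation-coefficient term in the loss and information-preserving INN blocks — can be illustrated with a minimal numpy sketch. This is not the authors' implementation: `cc_loss` is a hypothetical loss term assuming the CC is used to reward correlation between the fused image and both sources, and the additive coupling pair stands in for a generic invertible block, showing why the forward pass is lossless.

```python
import numpy as np

# --- Correlation coefficient (CC) as a candidate fusion-loss term --------
def correlation_coefficient(x, y, eps=1e-8):
    """Pearson correlation coefficient between two images."""
    x = x.ravel().astype(np.float64)
    y = y.ravel().astype(np.float64)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / (np.sqrt((xc @ xc) * (yc @ yc)) + eps))

def cc_loss(fused, ir, vis):
    """Hypothetical loss: higher correlation with both sources -> lower loss."""
    return 2.0 - correlation_coefficient(fused, ir) - correlation_coefficient(fused, vis)

# --- Additive coupling: the basic invertible-network building block ------
def coupling_forward(x1, x2, f):
    """Split the features into two halves: y1 = x1, y2 = x2 + f(x1)."""
    return x1, x2 + f(x1)

def coupling_inverse(y1, y2, f):
    """Exact inverse of the forward pass -- no information is lost."""
    return y1, y2 - f(y1)

rng = np.random.default_rng(0)
ir, vis = rng.random((64, 64)), rng.random((64, 64))
fused = 0.5 * (ir + vis)
print(f"CC loss of averaged fusion: {cc_loss(fused, ir, vis):.3f}")

f = np.tanh  # f itself need not be invertible for the coupling to be
y1, y2 = coupling_forward(ir, vis, f)
x1, x2 = coupling_inverse(y1, y2, f)
print("lossless:", np.allclose(x1, ir) and np.allclose(x2, vis))  # -> lossless: True
```

Note the design point the coupling demonstrates: because the inverse subtracts exactly what the forward pass added, invertibility holds for any choice of `f`, which is what lets INN blocks carry local detail through the network without loss.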