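Artificial intelligence
Computer vision
Computer science
Image fusion
Wavelet
Fusion
Image enhancement
Pattern recognition (psychology)
Image (mathematics)
Linguistics
Philosophy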
Authors
Qinghua Wang,Ziwei Li,Shuqi Zhang,Nan Chi,Qionghai Dai
Identifier
DOI:10.1109/tcsvt.2025.3549459
Abstract
Multi-modal image fusion aims to combine the key information from multiple sensor sources into a single informative visual representation of the imaged scene. Rapid and precise image fusion is crucial for practical applications such as autonomous driving and medical diagnostics. However, the primary challenge lies in balancing computational cost against the effectiveness of feature extraction while ensuring robust integration of salient features across modalities. This paper introduces WaveFusion, a wavelet vision transformer equipped with a saliency-guided loss strategy for multi-modal image fusion. First, to provide a comprehensive and efficient representation of multi-modal data, we introduce an adaptive wavelet transform module for feature decomposition and reconstruction. Self-attention mechanisms and convolutional networks are then applied in parallel to process the low-frequency and high-frequency components, yielding a wavelet-enhanced vision transformer. Second, WaveFusion uses a dual-aggregation attention approach that improves cross-modal feature complementarity and intra-modal feature coherence within a single fusion module. Third, we propose a dynamic saliency-informed selective loss function to refine the optimization process, with the aim of retaining critical features while maintaining overall image consistency across fusion scenarios. The efficacy and versatility of the method are validated on both infrared-visible fusion and medical image fusion tasks. Experimental results show that WaveFusion strikes a favorable balance between fusion performance and cost-efficiency, and additionally improves performance on downstream tasks such as multi-modal semantic segmentation and object detection.
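To make the decomposition and reconstruction step concrete, the following is a minimal sketch of a fixed single-level 2D Haar wavelet transform and a classical hand-crafted fusion rule (average the low-frequency band, keep the larger-magnitude high-frequency coefficient). It is not the adaptive wavelet module, the attention blocks, or the loss described in the abstract; the function names `haar_dwt2`, `haar_idwt2`, and `fuse_classical` are hypothetical and chosen only for illustration.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar transform; x must have even height and width."""
    # The four samples of every non-overlapping 2x2 block.
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a - b + c - d) / 2.0  # difference between columns
    hl = (a + b - c - d) / 2.0  # difference between rows
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, (lh, hl, hh)


def haar_idwt2(ll, highs):
    """Exact inverse of haar_dwt2."""
    lh, hl, hh = highs
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def fuse_classical(img_a, img_b):
    """Classical wavelet fusion rule (an assumption, not the paper's method)."""
    ll_a, highs_a = haar_dwt2(img_a)
    ll_b, highs_b = haar_dwt2(img_b)
    # Low frequencies: plain average of the two modalities.
    ll = 0.5 * (ll_a + ll_b)
    # High frequencies: keep whichever coefficient has larger magnitude.
    highs = tuple(
        np.where(np.abs(ha) >= np.abs(hb), ha, hb)
        for ha, hb in zip(highs_a, highs_b)
    )
    return haar_idwt2(ll, highs)
```

Because the transform is exactly invertible, splitting an image into sub-bands loses no information, which is what allows the low-frequency and high-frequency components to be processed by separate branches and then recombined. In a learned model such as the one the abstract describes, the fixed rules in `fuse_classical` would be replaced by trained components.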