Image fusion, Computer science, Artificial intelligence, Joint (building), Computer vision, Image registration, Modal, Image processing, Fusion, Net (polyhedron), Image (mathematics), Pattern recognition (psychology), Mathematics, Engineering, Construction engineering, Linguistics, Chemistry, Philosophy, Geometry, Polymer chemistry
Authors
Ming Lu, Min Jiang, Xuefeng Tao, Jun Kong
Identifier
DOI:10.1109/tip.2025.3586507
Abstract
Joint multi-modal image registration and fusion (JMIRF) typically follows a register-first, fuse-later paradigm: a registration module aligns parallax images, and a fusion module fuses the registered images. Existing research typically focuses on mutual enhancement between the two modules, but this amounts to a straightforward combination rather than an efficient, unified network. Moreover, executing the two modules separately can be inefficient, as the total runtime is simply the sum of the two steps and any potentially shared structure goes unexploited. In this paper, we propose an Adaptive Unified Network (AU-Net) following a novel end-to-end paradigm called Feature-Level Joint Training (FLJT). First, AU-Net learns registration and fusion within a unified network through shared structure and hierarchical semantic interaction; a multi-level dynamic fusion module is designed to adaptively fuse input features across scales and modalities. Second, image-to-image translation based on Denoising Diffusion Probabilistic Models (DDPMs) is introduced to train AU-Net with simple and reliable single-modal metrics. Unlike previous unidirectional translation, we explore bidirectional translation to provide additional implicit branch supervision. Furthermore, a cache-like scheme is proposed to elegantly circumvent the additional computational overhead caused by the iterative denoising of DDPMs. Finally, our method was validated on two publicly available datasets, demonstrating advantages over state-of-the-art methods in qualitative evaluation, quantitative evaluation, and computational complexity. The code will be publicly available at https://github.com/luming1314/AU-Net.
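The abstract describes the multi-level dynamic fusion module only at a high level. As a rough illustration of what adaptive, per-scale fusion of two modalities could look like, here is a minimal PyTorch sketch; the class name `DynamicFusionBlock`, the gating design, and the channel sizes are assumptions made for illustration, not the authors' AU-Net code.

```python
# Hypothetical sketch of a multi-level dynamic fusion block: two modality
# feature maps at one scale are fused via a learned, input-dependent gate.
# Names and design choices are illustrative assumptions, not AU-Net itself.
import torch
import torch.nn as nn

class DynamicFusionBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Predict per-pixel fusion weights from the concatenated modalities.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        w = self.gate(torch.cat([feat_a, feat_b], dim=1))  # adaptive weights in (0, 1)
        fused = w * feat_a + (1.0 - w) * feat_b            # convex per-pixel mix
        return self.refine(fused)

# Apply one block per scale of a shared feature pyramid ("multi-level").
blocks = nn.ModuleList(DynamicFusionBlock(c) for c in (64, 128, 256))
feats_a = [torch.randn(1, c, 64 // (2 ** i), 64 // (2 ** i)) for i, c in enumerate((64, 128, 256))]
feats_b = [torch.randn_like(f) for f in feats_a]
fused_pyramid = [blk(fa, fb) for blk, fa, fb in zip(blocks, feats_a, feats_b)]
```

Similarly, the "cache-like scheme" suggests that the expensive multi-step DDPM denoising is run sparingly and its outputs reused rather than recomputed every iteration. Below is a minimal sketch under that assumption; the `TranslationCache` class and the `translate_fn` hook are hypothetical placeholders, not the paper's API.

```python
# Hypothetical cache for DDPM-translated pseudo targets: run the iterative
# denoising once per sample, then reuse the stored result on later lookups.
from pathlib import Path
import torch

class TranslationCache:
    def __init__(self, cache_dir: str):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get(self, sample_id: str, image: torch.Tensor, translate_fn) -> torch.Tensor:
        path = self.cache_dir / f"{sample_id}.pt"
        if path.exists():
            return torch.load(path)           # cache hit: skip iterative denoising
        with torch.no_grad():
            translated = translate_fn(image)  # expensive multi-step DDPM sampling
        torch.save(translated, path)
        return translated
```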