Computer science
Transformer
Focus (optics)
Fusion
Image (mathematics)
Artificial intelligence
Computer vision
Human-computer interaction
Electrical engineering
Optics
Voltage
Linguistics
Physics
Engineering
Philosophy
Authors
Huifang Zhai,Wenyi Zheng,Yuncan Ouyang,Xiaofei Pan,Wanli Zhang
Identifier
DOI:10.1016/j.engappai.2024.107967
Abstract
Multi-focus image fusion technology is widely used in digital photography and is also considered a pre-task for other high-level vision tasks. Its main purpose is to produce an all-in-focus image from multiple partially focused sources accurately, naturally, and efficiently. Transformer models have recently achieved great success in numerous vision tasks, and their powerful attention modeling brings significant possibilities for focus property detection. To advance multi-focus image fusion further, this paper proposes a novel multi-focus image fusion method using an interactive transformer and asymmetric soft sharing. First, to exploit the transformer's strength in global context modeling while addressing its limitations in diversity and efficiency, a locally-enhanced interactive approach is devised. Specifically, it uses a cross-scale and cross-domain computation strategy to overcome the transformer's limitations on this domain-specific task, while alleviating the insufficient local feature perception and redundant computational cost of existing approaches. Second, to cope with distortion and artifacts in the fused image, the proposed method adopts a multi-task learning strategy with asymmetric soft sharing: it predicts the fused image and the decision map simultaneously, avoiding image distortion and yielding a more natural fusion effect. Experimental results show that the proposed method achieves promising focus property detection performance and high-fidelity fusion results. Moreover, compared with other state-of-the-art methods on five multi-focus image datasets, it is more prominent in both qualitative and quantitative analysis as well as efficiency. The source code is available at https://github.com/zwy0913/MFFT.
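To make the two mechanisms named in the abstract concrete, the sketch below shows one plausible way to wire up a locally-enhanced attention block (global self-attention plus a depthwise convolution for local detail) and a two-head predictor that outputs a fused image and a decision map jointly, with a soft parameter-sharing penalty between the task branches. This is a minimal illustrative sketch and not the authors' MFFT implementation; PyTorch is assumed, and all names (`LocallyEnhancedAttention`, `TwoHeadFusionNet`, `soft_sharing_penalty`) and hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn

class LocallyEnhancedAttention(nn.Module):
    """Global self-attention followed by a depthwise conv, approximating
    the 'locally-enhanced' interaction described in the abstract
    (hypothetical; not the paper's exact block)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Depthwise conv injects local feature perception that plain
        # token-wise attention lacks.
        self.local = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)

    def forward(self, x):                         # x: (B, C, H, W)
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)          # (B, H*W, C) tokens
        t, _ = self.attn(t, t, t)                 # global context
        g = t.transpose(1, 2).reshape(b, c, h, w)
        return x + g + self.local(x)              # fuse global and local cues

class TwoHeadFusionNet(nn.Module):
    """Multi-task model: predicts a fused image and a decision map
    from two partially focused grayscale sources."""
    def __init__(self, dim=32):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(2, dim, 3, padding=1), nn.ReLU(),
            LocallyEnhancedAttention(dim))
        # Two task-specific branches; their weights are *softly* tied
        # by the penalty below rather than hard-shared.
        self.image_head = nn.Conv2d(dim, 1, 3, padding=1)
        self.map_head = nn.Conv2d(dim, 1, 3, padding=1)

    def forward(self, src_a, src_b):
        feats = self.encode(torch.cat([src_a, src_b], dim=1))
        fused = torch.sigmoid(self.image_head(feats))      # all-in-focus image
        decision = torch.sigmoid(self.map_head(feats))     # per-pixel focus probability
        return fused, decision

def soft_sharing_penalty(branch_a, branch_b, weight=1e-3):
    """Soft parameter sharing: penalize (rather than force) divergence
    between the two task branches, keeping the sharing asymmetric."""
    return weight * sum((p - q).pow(2).sum()
                        for p, q in zip(branch_a.parameters(),
                                        branch_b.parameters()))
```

In this sketch, a training step would combine a reconstruction loss on the fused image, a supervision loss on the decision map, and `soft_sharing_penalty(model.image_head, model.map_head)` as a regularizer, so the two tasks inform each other without being forced to share identical weights.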