Keywords
artificial intelligence, image fusion, multi-modality, computer vision, generative model, pattern recognition
Authors
Jingwei Xin, Boneng Shi, Nannan Wang, Jie Li, Xinbo Gao
Source
Journal: IEEE Transactions on Image Processing (indexed in PubMed)
Date: 2025-10-06
Volume: PP (early access)
Identifier
DOI: 10.1109/TIP.2025.3615680
Abstract
Creating a comprehensively representative image while preserving the merits of each modality is a key focus of current multi-modality image fusion research. Existing unified methods often struggle to handle varying types of degradation while extracting modality-shared and modality-specific information from the source images, which limits their generative or representation capabilities under different conditions. To address this challenge, we propose MVFusion, a novel self-supervised masked variational autoencoder framework that simultaneously enhances generative training and representation learning. It is designed to cope with varying image quality and dataset composition within a unified framework while ensuring effective fusion of modality information. Specifically, MVFusion employs a self-supervised masked autoencoder to reduce the impact of redundancy and degradation in the source images, thereby learning the latent distribution of degraded input images during the generative training stage. In addition, we incorporate variational feature learning to further preserve distinctive modality features during the representation learning stage. Extensive experiments demonstrate that our model achieves promising results on several classical fusion tasks, including infrared-visible, multi-focus, multi-exposure, and medical image fusion. The code is available at https://github.com/shiboneng/MVFusion.