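Feature (linguistics)
Artificial intelligence
Infrared
Feature extraction
Modal verb
Fusion
Fuse (electrical)
Pattern recognition (psychology)
Computer science
Computer vision
Image fusion
Scale (ratio)
Representation (politics)
Feature detection (computer vision)
Feature vector
Salient
Autoencoder
Fusion rule
Property (philosophy)
Backbone network
Weighting
Authors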
Shuying Huang,Kai Zhang,Yong Yang,Weiguo Wan
Identifier
DOI:10.1016/j.patcog.2025.112419
Abstract
• A novel deep learning-based method, PCFFusion, is proposed for infrared and visible image fusion.
• An FDM is designed to extract features of infrared and visible images at different scales, which improves the network's feature representation ability.
• A Transformer-based CMFEM is constructed to correct and enhance the features of the two modalities at different scales.
• A CDFM is constructed to fuse features at different scales through interactive compensation between the features of the two modalities.
• Extensive experiments verify that the proposed network outperforms SOTA networks.
Infrared and visible image fusion (IVIF) aims to fuse thermal target information in infrared images and spatial texture information in visible images, improving the observability and comprehensibility of the fused images. Currently, most IVIF methods suffer from the loss of salient target information and texture details in fused images. To alleviate this problem, a progressive cross-modal feature fusion network (PCFFusion) for IVIF is proposed, which comprises two stages: feature extraction and feature fusion. In the feature extraction stage, to enhance the network's feature representation capability, a feature decomposition module (FDM) is constructed to extract features of the two modalities at different scales by defining a feature decomposition operation (FDO). In addition, by establishing correlations between the high-frequency and low-frequency components of the features of the two modalities, a cross-modal feature enhancement module (CMFEM) is built to correct and enhance the two features at each scale. The feature fusion stage fuses the features of the two modalities at each scale and supplements adjacent-scale features by constructing three cross-domain fusion modules (CDFMs).
To constrain the fused results to preserve more salient targets and richer texture details, a dual-feature fidelity loss function is defined, in which a salient weight map balances the two loss terms. Extensive experiments demonstrate that the fusion results of the proposed method highlight prominent targets from infrared images while retaining rich background details from visible images, and that the performance of PCFFusion is superior to some advanced methods. Specifically, compared with the best results obtained by the comparison methods, the proposed network achieves average increases of 30.35% and 10.9% in the Mutual Information (MI) and Standard Deviation (SD) metrics, respectively, on the TNO dataset.
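The abstract reports gains in MI and SD but does not say how the two metrics are computed. The sketch below follows the definitions conventionally used in fusion evaluation, and it is an assumption that the paper uses the same ones: SD is the standard deviation of the fused image's pixel intensities, and MI is the sum of the histogram-based mutual information between the fused image and each source image. The function names (`standard_deviation`, `mutual_information`, `fusion_mi`) are hypothetical, and the inputs are assumed to be 8-bit grayscale arrays of equal size.

```python
import numpy as np


def standard_deviation(fused):
    """SD metric: standard deviation of the fused image's pixel intensities."""
    fused = np.asarray(fused, dtype=np.float64)
    return float(fused.std())


def mutual_information(a, b, bins=256):
    """Mutual information (in bits) between two 8-bit grayscale images.

    Estimated from the joint intensity histogram; assumes values in [0, 255].
    """
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    joint, _, _ = np.histogram2d(a, b, bins=bins, range=[[0, 256], [0, 256]])
    pxy = joint / joint.sum()              # joint distribution
    px = pxy.sum(axis=1, keepdims=True)    # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of b, shape (1, bins)
    nz = pxy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


def fusion_mi(fused, infrared, visible, bins=256):
    """MI metric for fusion: information the fused image shares with each source."""
    return (mutual_information(fused, infrared, bins)
            + mutual_information(fused, visible, bins))
```

Higher values of both metrics are usually read as better: a larger SD indicates higher contrast in the fused image, and a larger MI indicates that more information from the source images was transferred into it.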