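Modality (human-computer interaction)
Fusion
Dual (grammatical number)
Sensor fusion
Computer science
Materials science
Artificial intelligence
Linguistics
Art
Literature
Philosophy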
Authors
Bo Ni, Jianhuan Qi, Haibin Ni, Xiaoyan Zhou, Jianhua Chang
Identifier
DOI: 10.1109/jsen.2025.3585304
Abstract
Fusing complementary feature information from visible and infrared images can effectively improve target detection performance. However, visible images lose information in low-light environments, so existing visible-infrared target detection methods perform unsatisfactorily there. To address this problem, we introduce dual-modality fusion detection with image enhancement and a four-head adaptively spatial feature fusion network (DFIF), which aims to improve multi-target detection accuracy in low-light environments. Through the proposed low-light image enhancement network (LLIE), DFIF is able to mine more information about the target during cross-modality fusion. Furthermore, considering the differences and complementarities between the two modalities, we design a dual-modality attention feature fusion module within the dual-stream feature extraction process to fully integrate complementary information at the feature extraction stage. Finally, we design a four-head adaptively spatial feature fusion network (FASFF) with a large-target detection layer that adaptively learns spatial weights for fusing feature maps at different scales. By optimizing these weighting parameters, FASFF effectively integrates features from different levels to achieve higher-accuracy target detection. We evaluate DFIF on the LLVIP and M3FD datasets, where it achieves mAP of 74.1% and 73.9%, respectively, demonstrating optimal detection performance and establishing a new benchmark for subsequent dual-modality target detection.
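The abstract does not give the FASFF equations. As a rough illustration of what adaptively learned spatial weights for multi-scale fusion can look like, the sketch below assumes four feature maps that have already been resized to a common shape and fuses them with per-pixel softmax weights. The function name `adaptive_spatial_fusion`, the array shapes, and the random weight logits are all hypothetical; in a trained network the logits would be produced by learned layers, and the authors' actual design may differ.

```python
import numpy as np

def adaptive_spatial_fusion(features, weight_logits):
    """Fuse K same-shaped feature maps with per-pixel softmax weights.

    features:      list of K arrays, each of shape (C, H, W), assumed to be
                   already resized to a common scale (hypothetical setup)
    weight_logits: array of shape (K, H, W); supplied by the caller here,
                   standing in for what a trained network would predict
    """
    stacked = np.stack(features, axis=0)                        # (K, C, H, W)
    # Subtract the per-pixel maximum for numerical stability before exp.
    shifted = weight_logits - weight_logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    weights = weights / weights.sum(axis=0, keepdims=True)      # sums to 1 over K
    # Broadcast the weights over the channel axis and sum across the K levels.
    fused = (weights[:, None, :, :] * stacked).sum(axis=0)      # (C, H, W)
    return fused, weights

# Illustrative inputs only: four levels ("four heads"), 8 channels, 16x16 maps.
rng = np.random.default_rng(0)
feats = [rng.standard_normal((8, 16, 16)) for _ in range(4)]
logits = rng.standard_normal((4, 16, 16))
fused, weights = adaptive_spatial_fusion(feats, logits)
```

Because the weights are normalized per pixel, each spatial location can favor a different level, which is one way to read "adaptively learn spatial weights" in the abstract. With all logits equal, the fusion reduces to a plain average of the four maps.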