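Modal verb
Multispectral image
Computer science
Channel (broadcasting)
Similarity (geometry)
Artificial intelligence
Resampling
Cosine similarity
Object (grammar)
Computer vision
Pattern recognition (psychology)
Image (mathematics)
Telecommunications
Chemistry
Polymer chemistry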
Authors
Junbo Jang,Chanyeong Park,Heegwang Kim,Jiyoon Lee,Joonki Paik
Identifier
DOI:10.1109/wacv61041.2025.00914
Abstract
Images obtained from different modalities can enhance the accuracy and reliability of a detection model by combining the complementary information in visible (RGB) and infrared (IR) images. However, integrating information from multiple modalities faces the following challenges: 1) the distinct characteristics of RGB and IR images lead to modality imbalance, 2) fusing multimodal information can greatly affect detection accuracy, because some of the information unique to each modality is lost during integration, and 3) RGB and IR images are fused with the noise of each modality preserved. To address these issues, we propose a novel multispectral object detection network with two main components: 1) a Cross-modal Information Complementary (CIC) module, and 2) a Cosine Similarity Channel Resampling (CSCR) module. The proposed method addresses the modality imbalance problem and efficiently fuses RGB and IR images at the feature level. Extensive experimental results on the LLVIP, FLIR, $M^{3}FD$, VEDAI, and KAIST benchmark datasets verify the effectiveness and generalization performance of the proposed multispectral object detection network compared with other state-of-the-art methods.
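The abstract names cosine similarity as the basis of the CSCR module but does not describe how the module works. As a minimal illustration of the underlying measure only, and not of the authors' module, the sketch below computes a per-channel cosine similarity between an RGB feature map and an IR feature map. The function names `cosine_similarity` and `channel_similarities`, the flattened list-of-lists layout, and the toy values are all assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors.

    Returns 0.0 if either vector has zero norm (a convention chosen here).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def channel_similarities(rgb_feat, ir_feat):
    """Per-channel cosine similarity between two feature maps.

    Hypothetical layout: each feature map is a list of channels, and each
    channel is a flattened list of floats of the same length.
    """
    if len(rgb_feat) != len(ir_feat):
        raise ValueError("feature maps must have the same number of channels")
    return [cosine_similarity(r, i) for r, i in zip(rgb_feat, ir_feat)]


# Toy feature maps with two channels of three values each (made-up numbers).
rgb_feat = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
ir_feat = [[2.0, 0.0, 4.0], [1.0, -1.0, 0.0]]

sims = channel_similarities(rgb_feat, ir_feat)
print(sims)
```

In this toy input the first channel pair is parallel (similarity close to 1) and the second is orthogonal (similarity close to 0). How such scores would drive channel resampling is specific to the paper and is not reproduced here.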