Computer science
Artificial intelligence
Computer vision
RGB color model
Multispectral image
Object detection
Segmentation
Pattern recognition (psychology)
Authors
Maoxun Yuan, Xiaorong Shi, Nan Wang, Yinyan Wang, Xingxing Wei
Identifier
DOI: 10.1016/j.inffus.2024.102246
Abstract
The integration of multispectral data in object detection, especially visible and infrared images, has recently attracted considerable attention. Complementary information from visible (RGB) and infrared (IR) images can ameliorate the challenges posed by variable lighting conditions, rendering them an invaluable resource in many fields, including RGB-IR object detection, RGB-IR semantic segmentation, and RGB-IR crowd counting. However, existing methods still suffer from weak misalignment and fusion imprecision, two problems that pose significant challenges for accurate object detection. In this paper, our primary focus is to solve these problems in RGB-IR object detection tasks. Specifically, we first propose a Translation-Scale-Rotation Alignment (TSRA) module to align the two modality features extracted from region proposals. Based on the aligned region features, we then introduce a Complementary Fusion Transformer (CFT) module to capture complementary features. These two modules are coupled in a unified Region of Interest (RoI) detection head, called the Cascade Alignment-Guided Transformer (CAGT), to obtain robust fused features. Finally, based on CAGT, we construct a region feature alignment and fusion detector, CAGTDet, for RGB-IR object detection. Comprehensive experiments on the aerial DroneVehicle dataset show that our method effectively mitigates both issues and yields robust detection results. Moreover, to evaluate the generalization of our method, we also perform experiments on natural images sampled from the KAIST multispectral pedestrian dataset. The results show that our method surpasses other state-of-the-art methods.
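The two ideas in the abstract, aligning IR region proposals to the RGB view with a translation-scale-rotation transform and then fusing the aligned features with attention, can be illustrated schematically. The sketch below is a minimal NumPy illustration of those concepts, not the authors' implementation: `tsra_align` applies a similarity transform whose parameters the paper's TSRA module would predict per region, and `cross_attention_fuse` is a bare single-head stand-in for the CFT module's complementary fusion.

```python
import numpy as np

def tsra_align(points, dx, dy, scale, theta):
    """Apply a translation-scale-rotation (similarity) transform to
    IR region-proposal corner points (N x 2) so they line up with the
    RGB view. In the paper, TSRA predicts (dx, dy, scale, theta) per
    region; here they are given explicitly for illustration."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return scale * points @ R.T + np.array([dx, dy])

def cross_attention_fuse(rgb_feats, ir_feats):
    """Single-head cross-attention: RGB region features (queries)
    attend to the aligned IR features (keys/values), with a residual
    connection. A minimal stand-in for the CFT module."""
    d = rgb_feats.shape[-1]
    scores = rgb_feats @ ir_feats.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return rgb_feats + attn @ ir_feats  # fused region features

# Rotating the point (1, 0) by 90 degrees maps it to (0, 1).
print(tsra_align(np.array([[1.0, 0.0]]), 0, 0, 1.0, np.pi / 2))
```

In the actual detector these steps would operate on pooled RoI feature maps inside a cascade head; the point here is only the order of operations: geometric alignment first, attention-based fusion second.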