Keywords
modality
computer science
pedestrian detection
artificial intelligence
computer vision
visibility
object detection
intersection over union
detector
pattern recognition
pedestrian
optics
engineering
telecommunications
physics
aerospace engineering
chemistry
polymer chemistry
transportation engineering
Authors
Napat Wanchaitanawong, Masayuki Tanaka, Takashi Shibata, Masatoshi Okutomi
Identifier
DOI: 10.1117/1.jei.32.1.013025
Abstract
Multi-modal pedestrian detection, which integrates visible and thermal sensors, has been developed to overcome many limitations of visible-modal pedestrian detection, such as poor illumination, cluttered backgrounds, and occlusion. By combining multiple modalities, we can detect pedestrians efficiently even under poor visibility. Nevertheless, multi-modal pedestrian detection critically assumes that the multi-modal images are perfectly aligned. In real-world situations, however, this assumption often fails: the viewpoints of the different modal sensors usually differ, so the positions of pedestrians exhibit disparities across the modal images. We propose a multi-modal Faster R-CNN specifically designed to handle misalignment between the two modalities. Faster R-CNN consists of a region proposal network (RPN) and a detector; we introduce position regressors for both modalities in the RPN and the detector. Intersection over union (IoU) is a useful metric for object detection but is defined only for a single-modal image. We extend it to a multi-modal IoU that evaluates the preciseness of detections in both modalities. Experimental results with the proposed evaluation metric demonstrate that the proposed method performs comparably to state-of-the-art methods and outperforms them on data with significant misalignment.
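The abstract's multi-modal IoU can be illustrated with a minimal sketch. Note this is an assumption, not the paper's exact definition: here single-modal IoU is the standard box overlap, and `multimodal_iou` is taken to average the per-modality IoUs over visible and thermal boxes, so a detection scores well only when it is precise in both modalities. Box format `(x1, y1, x2, y2)` and the averaging rule are illustrative choices.

```python
def iou(box_a, box_b):
    """Standard single-modal IoU of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def multimodal_iou(pred_vis, gt_vis, pred_thr, gt_thr):
    """Hypothetical multi-modal IoU: average the per-modality IoUs, so the
    score is high only when the predicted box is accurate in BOTH the
    visible and the thermal image (which may be misaligned)."""
    return 0.5 * (iou(pred_vis, gt_vis) + iou(pred_thr, gt_thr))
```

Under this averaging definition, a detection that is perfect in the visible image but misses the (disparity-shifted) thermal ground truth is capped at 0.5, which is how misalignment is penalized in the sketch.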