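Artificial intelligence
Computer science
Computer vision
Object detection
Convolutional neural network
Threshold
Intersection (aeronautics)
Pattern recognition (psychology)
Image (mathematics)
Engineering
Aerospace engineering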
Authors
Aarfa Bano Sheikh,Apurva Baru,Sanjana Shinde Desai,Supriya Mangale
Identifier
DOI:10.1117/1.jrs.16.024523
Abstract
Object detection is a technique used to localize and classify objects in an image or a video sequence. It is an emerging topic of research in the field of computer vision. However, detections in a video are affected by sensor-specific challenges. A convolutional neural network-based You Only Look Once, version 3 (YOLOv3) object detection algorithm was used to optimize computation time and accuracy. In the proposed methodology, the YOLOv3 architecture extracted significant features from both the visible and thermal imaging domains, and an adaptive fusion of both domains was performed to determine the dominant imaging domain and provide robust detections. The resulting YOLOv3 detections included the bounding box coordinates, confidence score, and class from each imaging domain, which were fused implicitly into a single plane. The sensor domain having the maximum number of object detections was chosen as the reference to be compared with the other domain for the adaptive fusion process. After fusion, the algorithm removed redundancy using adaptive intersection over union (IoU) thresholding. The mean average precision result obtained from the fusion algorithm was 44.25%. A comparative study was also carried out between pre-trained Common Objects in Context (COCO) weights and custom CAMEL dataset weights; it showed the significance of using adaptive fusion in challenging situations such as nighttime, shadow, varying illumination, moving camera, and crowds.
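The fusion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Detection` type, the `fuse` function, the fixed `iou_thresh` value, the tie-breaking rule, and the choice to keep the higher-confidence box among duplicates are all assumptions made for illustration. The abstract states only that the domain with more detections serves as the reference and that redundancy is removed by adaptive IoU thresholding.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """One detection: box corners, confidence score, class label, source domain."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    label: str
    domain: str  # "visible" or "thermal"


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse(visible: List[Detection], thermal: List[Detection],
         iou_thresh: float = 0.5) -> List[Detection]:
    """Fuse two detection lists onto one plane and drop duplicates.

    Assumptions (not from the paper): a fixed threshold stands in for the
    adaptive one, ties in detection count go to the visible domain, and of
    two overlapping same-class boxes the higher-confidence one is kept.
    """
    # The domain with more detections is the reference, as in the abstract.
    if len(visible) >= len(thermal):
        reference, other = visible, thermal
    else:
        reference, other = thermal, visible

    fused = list(reference)
    for det in other:
        best_i, best_iou = -1, iou_thresh
        # Compare only against the reference slots, not newly appended boxes.
        for i in range(len(reference)):
            if fused[i].label != det.label:
                continue
            overlap = iou(fused[i], det)
            if overlap >= best_iou:
                best_i, best_iou = i, overlap
        if best_i < 0:
            fused.append(det)      # seen only by the other sensor
        elif det.score > fused[best_i].score:
            fused[best_i] = det    # redundant pair: keep the stronger box
    return fused
```

In this sketch, an object seen by only one sensor survives the fusion, which is the behaviour one would want in conditions such as nighttime, where the visible camera may miss objects that the thermal camera detects. An adaptive scheme would replace the constant `iou_thresh` with a value chosen per frame or per scene; the abstract does not specify how that value is computed.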