Computer science
Artificial intelligence
Object detection
Computer vision
Multispectral image
Feature extraction
Pixel
Pedestrian detection
Hyperspectral imaging
Feature (linguistics)
Pattern recognition (psychology)
Remote sensing
Engineering
Pedestrian
Geography
Linguistics
Philosophy
Transportation engineering
Authors
Xiongxin Zou, Tangle Peng, Yimin Zhou
Source
Journal: IEEE Transactions on Industrial Informatics
[Institute of Electrical and Electronics Engineers]
Date: 2024-03-01
Volume/Issue: 20 (3): 3814-3823
Citations: 1
Identifier
DOI: 10.1109/tii.2023.3310792
Abstract
Timely and effective search and rescue (SAR) is highly desired in disaster response. Unmanned aerial vehicles (UAVs) equipped with suitable sensors can quickly conduct aerial searches to assist SAR. A visible-thermal human detection model based on an improved you only look once version 5 (YOLOv5) network is proposed to compensate for the deficiencies of visible data with thermal images. The complementary information between the visible and thermal images is exploited with a partially shared two-stream backbone network, so as to better preserve the information of each branch while reducing the domain distinction and extracting modality-invariant features. Features of the two modalities are fused via a fusion module with a multidimensional attention mechanism. By taking the pixels outside the region of interest as negative samples, an extra loss function suppresses uncorrelated feature extraction in the backbone and enhances the effective feature representation. The proposed visible-thermal human detection model has been deployed on a UAV with satisfactory human detection performance. Comparative experiments on the multispectral pedestrian dataset KAIST further demonstrate that the proposed model outperforms other visible-thermal object detection models in terms of log-average miss rate.
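The attention-based fusion step the abstract describes can be illustrated with a minimal sketch. The paper's exact multidimensional attention design is not specified here, so the snippet below substitutes a simple channel-attention weighting: each modality's feature map is globally pooled per channel, a softmax over the two modalities yields per-channel fusion weights, and the fused map is their weighted sum. The function name and shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def channel_attention_fusion(feat_rgb, feat_thermal):
    """Fuse visible and thermal feature maps of shape (C, H, W) with a
    simple channel-attention weighting -- an illustrative stand-in for
    the paper's multidimensional attention fusion module (assumption)."""
    stacked = np.stack([feat_rgb, feat_thermal])          # (2, C, H, W)
    # Global average pooling per modality and channel.
    pooled = stacked.mean(axis=(2, 3))                    # (2, C)
    # Softmax over the two modalities -> per-channel fusion weights.
    e = np.exp(pooled - pooled.max(axis=0, keepdims=True))
    weights = e / e.sum(axis=0, keepdims=True)            # (2, C), sums to 1
    # Convex combination of the two modalities, channel by channel.
    fused = (weights[:, :, None, None] * stacked).sum(axis=0)  # (C, H, W)
    return fused

# Toy features: 8 channels on a 16x16 grid for each modality.
rgb = np.random.rand(8, 16, 16)
thm = np.random.rand(8, 16, 16)
fused = channel_attention_fusion(rgb, thm)
```

Because the weights form a convex combination per channel, every fused value lies between the corresponding visible and thermal responses, so neither modality can be entirely discarded.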