Computer science
Transformer
Mechanism (biology)
Real-time computing
Engineering
Electrical engineering
Physics
Voltage
Quantum mechanics
Authors
Wenyun Zhou, Chengtao Cai, Chenming Li, Hao Xu, Haocheng Shi
Identifier
DOI: 10.1109/tim.2024.3472805
Abstract
Real-time acquisition of airport scene information is crucial for airport safety and for optimizing airport utilization efficiency. However, detecting airport objects remains challenging: people and vehicles appear as small targets in airport scene images, and public airport data are scarce, which makes it difficult to achieve high accuracy and real-time speed at the same time. This article proposes a novel airport object detection approach, airport detector YOLO (AD-YOLO), which addresses this challenge by combining an improved you only look once (YOLO) detector, the Swin Transformer, and an attention mechanism. Specifically, we introduce the Swin Transformer, which retains the Transformer's global attention for feature extraction while reducing its computational complexity, into the head network of YOLOv7 to improve the fusion of high-dimensional feature information. We also design an efficient channel spatial attention (ECSA) module and introduce a small object detection layer (SODL) to improve the detection accuracy of small targets in the airport scene. We test the proposed method on the self-constructed multiple airport surveillance dataset (MASD), which contains 5736 images captured from real airport footage and online airport video. The experimental results show that AD-YOLO achieves 71.6% mean average precision (mAP), exceeding the mAP of the baseline method by 4.4%. The proposed method runs at 101.4 frames/s (FPS) on the NVIDIA RTX3080 GPU and 17.8 FPS on the Jetson Orin NX, meeting the real-time and accuracy requirements of the airport scene. Finally, experimental results on the public airport surface surveillance (ASS) dataset show that AD-YOLO outperforms other detection methods, demonstrating its effectiveness.
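To put the reported throughput in more concrete terms, the following minimal Python sketch converts the FPS figures from the abstract into average per-frame latency and derives the baseline mAP implied by the stated gain. The 15 FPS camera rate is a hypothetical value chosen only for illustration, and the baseline figure assumes the 4.4% improvement is expressed in absolute percentage points; neither assumption comes from the abstract.

```python
def frame_latency_ms(fps: float) -> float:
    """Average time spent per frame, in milliseconds, at a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput figures reported in the abstract.
reported_fps = {"NVIDIA RTX3080": 101.4, "Jetson Orin NX": 17.8}

# Hypothetical camera frame rate (an assumption, not stated in the abstract).
ASSUMED_CAMERA_FPS = 15.0

latencies = {dev: frame_latency_ms(fps) for dev, fps in reported_fps.items()}
keeps_up = {dev: fps >= ASSUMED_CAMERA_FPS for dev, fps in reported_fps.items()}

# Baseline mAP implied by the abstract, assuming the 4.4% gain is in
# absolute percentage points rather than a relative improvement.
implied_baseline_map = round(71.6 - 4.4, 1)

for dev, ms in latencies.items():
    print(f"{dev}: {ms:.2f} ms per frame, keeps up with camera: {keeps_up[dev]}")
print(f"Implied baseline mAP: {implied_baseline_map}%")
```

Under these assumptions, both devices process frames faster than the assumed camera delivers them, with the embedded Jetson Orin NX having the narrower margin.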