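Computer science
Tracking (education)
Artificial intelligence
Scale (ratio)
Deep learning
Monocular
Feature (linguistics)
Discriminant
Computer vision
Upsampling
Block (permutation group theory)
Feature extraction
Object detection
Backbone network
Video tracking
Pattern recognition (psychology)
Encoder
Precision and recall
Convolutional neural network
Inference
Monocular vision
Scale space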
Authors
Ruida Yang, Chaofang Hu, Yiming Jv, Yanqiao Chen
Identifier
DOI:10.23919/ccc64809.2025.11179124
Abstract
This paper proposes a monocular-vision method for target tracking from a high-altitude unmanned aerial vehicle (UAV). First, the classic YOLOv5 object detection network is improved. Upsampling in the neck network of YOLOv5 generates a feature map with 128 channels and a size of 160×160. This feature map is adjusted to match the channel count and spatial dimensions of the third backbone layer's output so that the two can be fused. The fused output allows a small-target detection layer to be integrated into the network. In addition, a transformer encoder module is introduced in both the backbone and neck networks of YOLOv5, and a convolutional block attention module (CBAM) is incorporated in the neck network to further enhance the network's feature extraction capability. The improved network is trained on a small-target dataset. Next, the improved YOLOv5 is combined with the fast discriminative scale space tracking (FDSST) algorithm and applied to a monocular UAV tracking simulation platform. The improved YOLOv5 model achieves higher recall and mAP than the original on VisDrone2019, and Gazebo simulations verify the effectiveness of the proposed method for UAV tracking of small-scale moving ground targets.
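The shape bookkeeping behind the upsample-and-fuse step can be sketched as follows. This is only an illustration, not the authors' implementation: the 640×640 input size, the 256-channel 80×80 neck feature, the 1×1 channel-reducing convolution, and concatenation as the fusion operation are all assumptions, chosen because they reproduce the 128-channel, 160×160 map mentioned in the abstract. The helper names are hypothetical.

```python
from typing import Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def conv1x1(shape: Shape, out_channels: int) -> Shape:
    """A 1x1 convolution with stride 1 changes only the channel count."""
    _, h, w = shape
    return (out_channels, h, w)


def upsample(shape: Shape, scale: int = 2) -> Shape:
    """Spatial upsampling multiplies height and width, keeping channels."""
    c, h, w = shape
    return (c, h * scale, w * scale)


def concat(a: Shape, b: Shape) -> Shape:
    """Channel-wise concatenation requires identical spatial dimensions."""
    if a[1:] != b[1:]:
        raise ValueError(f"spatial mismatch: {a[1:]} vs {b[1:]}")
    return (a[0] + b[0], a[1], a[2])


# Assumed values for a 640x640 input (not stated in the abstract):
neck_feature = (256, 80, 80)         # hypothetical neck feature at stride 8
backbone_layer3 = (128, 160, 160)    # third backbone layer output at stride 4

# Reduce channels, then upsample by 2 to reach 128 channels at 160x160.
upsampled = upsample(conv1x1(neck_feature, 128))

# Fuse with the backbone feature; the result would feed a small-target head.
fused = concat(upsampled, backbone_layer3)

print(upsampled)  # (128, 160, 160)
print(fused)      # (256, 160, 160)
```

Under these assumptions the extra detection layer operates at stride 4, so each cell of the 160×160 map covers a 4×4 pixel patch of the input, which is the usual motivation for adding such a layer when targets are small.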