Authors
Yukai Ma, Caiping Xi, Ting Ma, Han Xu Sun, Hongyu Lu, Xiang Xu, Xu Chen
Abstract
Small-target detection from UAVs in urban security, disaster monitoring, agricultural inspection, and other fields faces rising demands for both accuracy and real-time performance. However, existing detection algorithms still suffer from weak small-target representation, high computational overhead, and poor deployment adaptability. This paper therefore proposes I-YOLOv11n, a lightweight algorithm based on YOLOv11n that is systematically improved in both feature enhancement and structure compression. For feature enhancement, the RFCBAMConv module, which combines deformable convolution with channel–spatial attention, dynamically adjusts the receptive field and strengthens edge features. The STCMSP multiscale context pyramid and a lightweight Transformer–DyHead hybrid detection head are designed in combination with the multiscale dilated feature pyramid (DFPC) to achieve cross-scale semantic modeling and adaptive focusing on target regions. For compression, a collaborative lightweight strategy is proposed. First, hybrid knowledge distillation that integrates response alignment, feature imitation, and structure preservation transfers the teacher model's semantic discrimination ability for small targets, guiding and protecting the subsequent compression. Second, a LAMP–Taylor channel pruning mechanism removes model redundancy while protecting the key shallow-layer channels that are sensitive to small targets. Finally, K-means++ anchor box optimization based on IoU distance adapts the anchors to the feature structure retained after pruning and to the scale distribution of small UAV targets. The model size is significantly reduced (3.87 M parameters, 14.7 GFLOPs) while small-target detection accuracy is maintained and improved.
Experiments on the VisDrone, AI-TOD, and SODA-A datasets show that I-YOLOv11n improves mAP@0.5 and mAP@0.5:0.95 by 7.1% and 4.9%, respectively, over the baseline YOLOv11n while maintaining real-time processing, which supports its combined advantages in accuracy, model size, and deployability.
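As an illustration of the anchor optimization step named in the abstract, the following is a minimal sketch of K-means++ clustering over box sizes with 1 - IoU as the distance. It is not the authors' implementation: the function names (`iou_wh`, `iou_distance`, `kmeanspp_init`, `anchor_kmeans`), the mean-based centre update, the iteration cap, and the fixed seed are all assumptions made for this example.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def iou_distance(box, anchor):
    """Distance used for clustering: 0 for identical shapes, near 1 for very different ones."""
    return 1.0 - iou_wh(box, anchor)


def kmeanspp_init(boxes, k, rng):
    """k-means++ seeding: each new centre is sampled with probability
    proportional to its squared IoU distance to the nearest chosen centre."""
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(iou_distance(b, c) for c in centres) ** 2 for b in boxes]
        total = sum(d2)
        if total == 0:  # every box coincides with an existing centre
            centres.append(rng.choice(boxes))
            continue
        r = rng.uniform(0, total)
        acc = 0.0
        for b, w in zip(boxes, d2):
            acc += w
            if acc > r:
                centres.append(b)
                break
        else:  # rounding fallback: take the farthest box
            centres.append(boxes[d2.index(max(d2))])
    return centres


def anchor_kmeans(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs into k anchors, returned in ascending area order.
    The mean update is an assumption; a median update is also common."""
    rng = random.Random(seed)
    centres = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            j = min(range(k), key=lambda i: iou_distance(b, centres[i]))
            clusters[j].append(b)
        new_centres = []
        for centre, members in zip(centres, clusters):
            if members:
                n = len(members)
                new_centres.append((sum(m[0] for m in members) / n,
                                    sum(m[1] for m in members) / n))
            else:  # keep the old centre if its cluster emptied
                new_centres.append(centre)
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Hypothetical ground-truth box sizes in pixels, for illustration only.
    sizes = [(4, 4), (5, 5), (4, 5), (20, 20), (22, 21), (21, 22),
             (80, 80), (82, 79), (81, 82)]
    print(anchor_kmeans(sizes, k=3))
```

Using 1 - IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which matters when most targets are only a few pixels wide. In practice the input would be the (width, height) pairs of the training labels, rescaled to the network input size.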