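Convolution (computer science)
Computer science
Artificial intelligence
Object detection
Feature (linguistics)
Computer vision
Pattern recognition (psychology)
Feature extraction
Object (grammar)
Fusion
Feature learning
Artificial neural network
Linguistics
Philosophy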
Authors
Ya Shi, Chenyi Wang, Shengjun Xu, Ming‐Dong Yuan, Feixiang Liu, Lele Zhang
Identifier
DOI:10.1109/lgrs.2024.3362890
Abstract
Object detection (OD) in unmanned aerial vehicle (UAV) images faces many challenges, among which diverse-scale objects and small objects are particularly prominent. To alleviate these challenges, we propose a novel multiscale feature learning and feature fusion network guided by deformable convolution. First, a deformable convolution-guided feature learning (DCGFL) block is designed in the backbone to extract more effective multiscale features. The DCGFL block leverages the adaptability of deformable convolution to the shapes and scales of objects, which acts much like spatial attention. It also employs channel attention to identify important feature maps, so the proposed backbone combines the functionality of spatial attention and channel attention. Second, in the neck, we devise a simple generalized feature pyramid network (SimpleGFPN) with several deformable convolution-guided feature fusion (DCGFF) blocks to fuse multiscale features. The proposed neck has cross-layer and cross-scale pathways, facilitating effective information exchange and fusion between shallow spatial features and deep semantic features. Third, the SIoU loss is used to better model the bounding box regression loss. Finally, experimental results on the VisDrone2021 and UAVDT datasets show that the proposed method outperforms the compared OD methods. In terms of mean average precision, we obtain 37.8% on VisDrone2021 and 18.5% on UAVDT.
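The abstract relies on deformable convolution adapting its sampling positions to object shape and scale. As a rough illustration of that general mechanism only, and not of the authors' DCGFL or DCGFF blocks, the sketch below computes a 3x3 deformable convolution response at a single location in pure Python. The function names `bilinear_sample` and `deformable_conv_at` are hypothetical, and the offsets are passed in by hand, whereas a real network would typically predict them with an auxiliary convolution layer.

```python
import math


def bilinear_sample(feature, y, x):
    """Sample a 2D feature map (list of rows) at a fractional (y, x).

    Positions outside the map contribute zero (zero padding).
    """
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours, weighted by proximity.
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value


def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """3x3 deformable convolution response at output location (cy, cx).

    kernel  : 3x3 list of weights.
    offsets : 3x3 list of (dy, dx) pairs, one per kernel tap
              (assumed given here; normally learned).
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            # Regular grid position plus the per-tap offset.
            sy = cy + (i - 1) + dy
            sx = cx + (j - 1) + dx
            out += kernel[i][j] * bilinear_sample(feature, sy, sx)
    return out


# Toy usage: a 5x5 ramp feature map and an averaging kernel.
feature = [[float(5 * r + c) for c in range(5)] for r in range(5)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
shifted_offsets = [[(0.5, 0.5)] * 3 for _ in range(3)]

regular = deformable_conv_at(feature, kernel, zero_offsets, 2, 2)
shifted = deformable_conv_at(feature, kernel, shifted_offsets, 2, 2)
print(regular, shifted)
```

With all offsets at zero the function reduces to an ordinary 3x3 convolution. Nonzero offsets move each tap off the regular grid, which is what lets the receptive field follow an object's shape.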