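Upsampling
Computer science
Artificial intelligence
Feature (linguistics)
Computer vision
Object detection
Remote sensing
Discriminant
Pattern recognition (psychology)
Image (mathematics)
Geography
Linguistics
Philosophy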
Authors
Qihong Wu,Bin Zhang,Chang Guo,Lei Wang
Source
Journal: Drones
[Multidisciplinary Digital Publishing Institute]
Date: 2023-07-02
Volume (issue): 7 (7): 439
Citations: 8
Identifier
DOI:10.3390/drones7070439
Abstract
Uncrewed Aerial Vehicles (UAVs) are instrumental in advancing the field of remote sensing. Nevertheless, the complexity of the background and the dense distribution of objects both present considerable challenges for object detection in UAV remote sensing images. This paper proposes a Multi-Branch Parallel Network (MBPN) based on the ViTDet (Visual Transformer for Object Detection) model, which aims to improve object detection accuracy in UAV remote sensing images. Initially, the discriminative ability of the input feature map of the Feature Pyramid Network (FPN) is improved by incorporating the Receptive Field Enhancement (RFE) and Convolutional Self-Attention (CSA) modules. Subsequently, to mitigate the loss of semantic information, the sampling process of the FPN is replaced by Multi-Branch Upsampling (MBUS) and Multi-Branch Downsampling (MBDS) modules. Lastly, a Feature-Concatenating Fusion (FCF) module is employed to merge feature maps of varying levels, thereby addressing the issue of semantic misalignment. This paper evaluates the performance of the proposed model on both a custom UAV-captured WCH dataset and the publicly available NWPU VHR10 dataset. The experimental results demonstrate that the proposed model achieves an increase in APL of 2.4% and 0.7% on the WCH and NWPU VHR10 datasets, respectively, compared to the baseline model ViTDet-B.
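The abstract describes merging feature maps of different pyramid levels by concatenation. The sketch below illustrates that general idea only: it is not the paper's MBUS or FCF module, whose internals the abstract does not give. Plain nearest-neighbour upsampling stands in for the multi-branch upsampling, and the names `upsample_nearest` and `concat_fuse`, the channel-first list layout, and the factor of 2 are all assumptions made for illustration.

```python
from typing import List

Channel = List[List[float]]   # H x W grid of activations
FeatureMap = List[Channel]    # C x H x W, channel-first (assumed layout)


def upsample_nearest(fmap: FeatureMap, factor: int = 2) -> FeatureMap:
    """Nearest-neighbour upsampling: repeat each value `factor` times along H and W.

    Stand-in for a learned upsampling module; not the paper's MBUS.
    """
    out: FeatureMap = []
    for channel in fmap:
        rows: Channel = []
        for row in channel:
            wide = [v for v in row for _ in range(factor)]    # widen the row
            rows.extend(list(wide) for _ in range(factor))    # repeat it vertically
        out.append(rows)
    return out


def concat_fuse(fine: FeatureMap, coarse: FeatureMap, factor: int = 2) -> FeatureMap:
    """Bring the coarse map to the fine map's resolution, then stack channels.

    Hypothetical helper: concatenation keeps both levels' channels side by side
    instead of summing them element-wise.
    """
    up = upsample_nearest(coarse, factor)
    if len(up[0]) != len(fine[0]) or len(up[0][0]) != len(fine[0][0]):
        raise ValueError("spatial sizes do not match after upsampling")
    return fine + up  # channel-wise concatenation


# A 1-channel 4x4 "fine" map and a 1-channel 2x2 "coarse" map.
fine = [[[1.0, 2.0, 3.0, 4.0] for _ in range(4)]]
coarse = [[[5.0, 6.0], [7.0, 8.0]]]
fused = concat_fuse(fine, coarse)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

In a real detector a convolution would typically follow the concatenation to mix the stacked channels; that step is omitted here.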