Authors
Shuai Ma, Yihong Zhang, Ling Peng, Chen Sun, Baolong Ding, Yongdong Zhu
Abstract
UAV object detection is crucial in open water search and rescue missions. Because UAV images are captured from varying perspectives and altitudes, the apparent size of objects varies significantly. Insufficient feature representation and background confusion make open water object detection particularly difficult. Current deep learning-based detection methods rely on convolution to extract features at a fixed spatial scale. The resulting limited receptive field leads to insufficient feature representation, causing false and missed detections that severely reduce detection accuracy. This paper proposes OWRT-DETR, an efficient, feature-enhanced real-time detection network based on the Transformer architecture, to address the challenges of detecting diverse objects in open water UAV images. To the best of our knowledge, a Transformer-based detection network has not yet been explored for open water UAV images. OWRT-DETR incorporates a cross-scale feature pyramid module (CFPIM), a multi-scale sensing fusion (MSSF) module, and a small object enhancement module (SOEM). Without increasing complexity, these modules strengthen the network's cross-scale interaction, cross-channel global spatial association, and local perception, which improves the weak feature representation of small targets and suppresses easily confused backgrounds. Three public datasets are used to validate the effectiveness of OWRT-DETR. OWRT-DETR achieves an average precision (AP) of 51.5%, 45.9%, and 50.6% on the SeaDronesSee, Aerial Dataset of Floating Objects, and Aerialbus Ship datasets, respectively, exceeding the performance of several state-of-the-art models. To improve efficiency and reduce computational cost, the backbone network of OWRT-DETR is reconstructed using PConv and Rep, resulting in Light-OWRT-DETR. Compared with OWRT-DETR, Light-OWRT-DETR is faster, uses fewer parameters, requires less computation, and achieves higher accuracy.
The code will be available at https://github.com/mshauima/OWRT-DETR.
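As a rough illustration of why a PConv-based backbone can reduce parameters and computation, the sketch below compares the cost of a regular convolution with one that convolves only a fraction of the channels and passes the rest through unchanged. It assumes that PConv here means partial convolution in that sense; the abstract does not define it. The function names, the 64-channel 80 x 80 feature map, and the 1/4 channel ratio are hypothetical choices for illustration, not values taken from the paper.

```python
def conv_cost(channels_in, channels_out, kernel, height, width):
    """Return (parameters, multiply-accumulates) of a bias-free k x k convolution.

    Assumes stride 1 and 'same' padding, so the output keeps height x width.
    """
    params = kernel * kernel * channels_in * channels_out
    macs = params * height * width
    return params, macs


def partial_conv_cost(channels, kernel, height, width, ratio=0.25):
    """Cost when only `ratio` of the channels are convolved (assumed PConv behaviour).

    The remaining channels are copied through untouched and add no parameters.
    """
    conv_channels = int(channels * ratio)
    return conv_cost(conv_channels, conv_channels, kernel, height, width)


# Hypothetical layer: 64 channels, 3 x 3 kernel, 80 x 80 feature map.
full_params, full_macs = conv_cost(64, 64, 3, 80, 80)
pconv_params, pconv_macs = partial_conv_cost(64, 3, 80, 80)

print(f"regular conv: {full_params} params, {full_macs} MACs")
print(f"partial conv: {pconv_params} params, {pconv_macs} MACs")
```

Under these assumptions, convolving a quarter of the channels cuts both the parameter count and the multiply-accumulate count of the layer by a factor of 16. How Light-OWRT-DETR actually combines PConv with Rep is described in the paper itself, not here.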