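Remote sensing
Computer science
Channel (broadcasting)
Frequency domain
Object detection
Object (grammar)
Domain (mathematical analysis)
Artificial intelligence
Computer vision
Pattern recognition (psychology)
Geology
Telecommunications
Mathematical analysis
Mathematics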
Authors
Yang Xu, Yushan Pan, Zebin Wu, Zhihui Wei, Tianming Zhan
Identifier
DOI:10.1109/tgrs.2024.3500013
Abstract
The detection of oriented objects in remote sensing images remains a daunting challenge due to their complex backgrounds, various sizes, and especially arbitrary orientations. Most existing methods model the structural features of images only in the spatial domain, and their horizontal convolution kernels limit the model’s ability to perceive object direction information. Frequency features, in contrast, contain rich information about scale, texture, and angle, which can be a good complement to the spatial features. Inspired by this, we propose a multiscale spatial-frequency domain network (MSFN) that utilizes spatial-frequency information for oriented object detection; it can be integrated seamlessly into any convolutional neural network (CNN) architecture and trained end to end. First, multiscale Haar wavelet transforms are leveraged to extract multiscale frequency-domain features from the image. Subsequently, a channel alignment feature fusion module (CA-FFM) is proposed to fuse the high-level semantic features extracted by the CNN with the low-level texture features extracted by the wavelet transform at multiple scales. Finally, a channel self-attention (CSA)-based spatial-frequency feature perception module (SFPM) is designed to perform self-attention weighted aggregation on the fused features along the channel dimension, thereby constructing a novel spatial-frequency feature extraction backbone network for oriented object detectors in remote sensing images. Experimental results on the DOTA and HRSC2016 datasets validate the effectiveness and universality of the proposed method.
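The abstract names multiscale Haar wavelet transforms as the first step but gives no implementation details. The following is only a minimal NumPy sketch of a generic multilevel 2D Haar decomposition, not the authors' code. The function names `haar_dwt2` and `multiscale_haar`, the sub-band names, the orthonormal 1/2 scaling, and the choice to recurse on the low-pass band are all assumptions made for illustration.

```python
import numpy as np


def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform (illustrative helper).

    Assumes x is a 2D array with even height and width.
    """
    x = np.asarray(x, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0       # low-pass approximation
    row_diff = (a + b - c - d) / 2.0  # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0  # left column minus right column
    diag = (a - b - c + d) / 2.0      # diagonal difference
    return low, (row_diff, col_diff, diag)


def multiscale_haar(image, levels=3):
    """Apply haar_dwt2 repeatedly to the low-pass band.

    Returns a list with one (low, (row_diff, col_diff, diag)) entry per level,
    each at half the resolution of the previous one.
    """
    low = np.asarray(image, dtype=np.float64)
    pyramid = []
    for _ in range(levels):
        low, details = haar_dwt2(low)
        pyramid.append((low, details))
    return pyramid


# Usage: a constant 8x8 image decomposed over three levels.
image = np.ones((8, 8))
for low, (row_diff, col_diff, diag) in multiscale_haar(image, levels=3):
    print(low.shape, float(np.abs(diag).max()))
```

For a constant image every detail sub-band is zero, since only the low-pass band carries the mean intensity. On textured or edge-rich images the three detail bands respond to intensity changes in different directions, which is the kind of scale and angle information the abstract attributes to frequency features.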