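Computer science
Robustness (evolution)
Object detection
Transformer
Artificial intelligence
Fuse (electrical)
Computer vision
Detector
Feature extraction
Encoding (set theory)
Source code
Rotation (mathematics)
Pattern recognition (psychology)
Algorithm
Set (abstract data type)
Voltage
Telecommunications
Gene
Operating system
Electrical engineering
Physics
Engineering
Quantum mechanics
Biochemistry
Chemistry
Programming language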
Authors
Yang Tian,Mengmeng Zhang,Jinyu Li,Yangfan Li,Yang Hong,Wei Li
Identifier
DOI:10.1109/tgrs.2024.3351156
Abstract
We propose the feature pyramid network transformer decoder (FPNFormer) module, which handles the arbitrary orientation of objects in remote sensing images while improving the expressiveness and robustness of the model. It is a plug-and-play module that transfers well to various detection models and significantly improves their performance. Specifically, we use the computation of the transformer decoder, whose output depends only weakly on the order of the input data, to address the arbitrary orientation of objects in the image. We apply it at the feature fusion stage and design two fusion paths, top-down and bottom-up, to fuse features of different scales, which gives the model a stronger ability to perceive objects at different scales and angles. Experiments on commonly used benchmarks (DOTA1.0, DOTA1.5, SSDD, and RSDD) demonstrate that the proposed FPNFormer module significantly improves the performance of multiple arbitrary-oriented object detectors, for example a 1.99% mAP improvement for rotated RetinaNet on DOTA's cross-validation set. On the RSDD dataset, the baseline model with FPNFormer improves the mAP of large objects by 5.1%. Combined with more competitive models, the proposed method achieves 79.39% mAP on the DOTA1.0 dataset. The code is available at https://github.com/bityangtian/FPNFormer .
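
The abstract's remark that a transformer decoder's output depends only weakly on the order of its input can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, no learned projections, no positional encoding, and uses the same tokens as keys and values. The names cross_attention, queries, and tokens are hypothetical. Under those assumptions, shuffling the key/value tokens leaves each query's output unchanged up to floating-point rounding.

```python
import math
import random

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (illustrative only).

    Assumptions: no learned projections and no positional encoding.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        # Numerically stable softmax weights (normalized by `total` below).
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Weighted sum of the values, one output vector per query.
        out = [sum(w * v[j] for w, v in zip(weights, values)) / total
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

rng = random.Random(0)
dim = 4
# Hypothetical stand-ins for decoder queries and flattened feature-map tokens.
queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]
tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]

# Reorder the key/value tokens with a random permutation.
perm = list(range(len(tokens)))
rng.shuffle(perm)
shuffled = [tokens[i] for i in perm]

out_a = cross_attention(queries, tokens, tokens)
out_b = cross_attention(queries, shuffled, shuffled)

# Largest element-wise difference between the two outputs.
max_diff = max(abs(a - b) for ra, rb in zip(out_a, out_b) for a, b in zip(ra, rb))
print(max_diff < 1e-9)
```

In a real detector, positional encodings and learned projections reintroduce some dependence on token position, which is consistent with the abstract's wording of "weakly" rather than "not at all".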