Optimized Data Distribution Learning for Enhancing Vision Transformer‐Based Object Detection in Remote Sensing Images

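Artificial intelligence; Computer science; Computer vision; Object detection; Transformer; Pattern recognition (psychology); Engineering; Voltage; Electrical engineering
Authors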
Huaxiang Song,Junping Xie,Yunyang Wang,Lihua Fu,Yang Zhou,Xing Zhou
Source
Journal: Photogrammetric Record [Wiley]
Volume (issue): 40 (189); Citations: 1
Identifier
DOI:10.1111/phor.70004
Abstract

Existing Vision Transformer (ViT)‐based object detection methods for remote sensing images (RSIs) face significant challenges due to the scarcity of RSI samples and the over‐reliance on enhancement strategies originally developed for natural images. This often leads to inconsistent data distributions between training and testing subsets, resulting in degraded model performance. In this study, we introduce an optimized data distribution learning (ODDL) strategy and develop an object detection framework based on the Faster R‐CNN architecture, named ODDL‐Net. The ODDL strategy begins with an optimized augmentation (OA) technique, overcoming the limitations of conventional data augmentation methods. Next, we propose an optimized mosaic algorithm (OMA), improving upon the shortcomings of traditional Mosaic augmentation techniques. Additionally, we introduce a feature fusion regularization (FFR) method, addressing the inherent limitations of classic feature pyramid networks. These innovations are integrated into three modular, plug‐and‐play components—namely, the OA, OMA, and FFR modules—ensuring that the ODDL strategy can be seamlessly incorporated into existing detection frameworks without requiring significant modifications. To evaluate the effectiveness of the proposed ODDL‐Net, we develop two variants based on different ViT architectures: the Next ViT (NViT) small model and the Swin Transformer (SwinT) tiny model, both used as detection backbones. Experimental results on the NWPU10, DIOR20, MAR20, and GLH‐Bridge datasets demonstrate that both variants of ODDL‐Net achieve impressive accuracy, surpassing 23 state‐of‐the‐art methods introduced since 2023. Specifically, ODDL‐Net‐NViT attained accuracies of 78.3% on the challenging DIOR20 dataset and 61.4% on the GLH‐Bridge dataset. Notably, this represents a substantial improvement of approximately 23% over the Faster R‐CNN‐ResNet50 baseline on the DIOR20 dataset.
In conclusion, this study demonstrates that ViTs are well suited for high‐accuracy object detection in RSIs. Furthermore, it provides a straightforward solution for building ViT‐based detectors, offering a practical approach that requires little model modification.