BEV transformer for visual 3D object detection applied with retentive mechanism

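Keywords: Transformer; Computer science; Mechanism (biology); Computer vision; Artificial intelligence; Engineering; Physics; Electrical engineering; Voltage; Quantum mechanics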
Authors
Jincheng Pan, Xiaoci Huang, Suyun Luo, Fang Ma
Source
Journal: Transactions of the Institute of Measurement and Control [SAGE Publishing]
Identifier
DOI:10.1177/01423312241308367
Abstract

Three-dimensional (3D) vision perception tasks that use multiple cameras, including 3D object detection and map segmentation, are pivotal for autonomous driving systems. We introduce RetentiveBEV, an approach that uses a Transformer to learn spatiotemporal features from Bird’s Eye View (BEV) representations, which serve as the foundation for downstream autonomous driving tasks. In brief, spatial features within regions of interest (ROIs) are extracted via spatial cross-attention, while temporal information is integrated via temporal self-attention, enriching the BEV with historical data. Our spatial cross-attention is enhanced with a retentive mechanism, which prioritizes information surrounding the focal points and allows the attention to be decomposed for better computational efficiency. On the nuScenes test split, our approach achieves a nuScenes Detection Score (NDS) of 60.4% without additional training data, an 8.7% improvement over the baseline (BEVFormer-base), and is close to the current state-of-the-art method SparseBEV, which achieves an NDS of 65.7% as of August 2024. On the nuScenes val split, our method achieves 55.8 NDS while maintaining a real-time inference speed of 25.3 FPS, and we are working on further accelerating inference using TensorRT (mAP and NDS are defined in equations (12) and (13)). Integrating the retentive mechanism notably improves precision and recall in 3D object detection while also speeding up inference.
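
The abstract does not spell out the form of the retentive mechanism, so the following is only a minimal sketch of one common way to make attention favor nearby positions: a decay mask whose entries shrink with the Manhattan distance between grid cells. The function names, the decay rate `gamma`, and the renormalisation step are all assumptions made for illustration, not the authors' implementation. Such a mask factorizes into a row term and a column term, which is one reason a mechanism of this kind could be decomposed along the two grid axes.

```python
import math


def manhattan_decay_mask(height, width, gamma=0.9):
    """Hypothetical helper: (H*W) x (H*W) mask with entries gamma ** Manhattan distance."""
    cells = [(r, c) for r in range(height) for c in range(width)]
    return [
        [gamma ** (abs(r1 - r2) + abs(c1 - c2)) for (r2, c2) in cells]
        for (r1, c1) in cells
    ]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def retentive_attention(scores, mask):
    """Scale softmax attention weights by the decay mask.

    Each row is renormalised to sum to 1; this is an illustrative choice,
    not something stated in the abstract.
    """
    out = []
    for score_row, mask_row in zip(scores, mask):
        weighted = [p * d for p, d in zip(softmax(score_row), mask_row)]
        total = sum(weighted)
        out.append([w / total for w in weighted])
    return out


if __name__ == "__main__":
    mask = manhattan_decay_mask(2, 2, gamma=0.5)
    uniform_scores = [[0.0] * 4 for _ in range(4)]
    # With uniform scores, the weights depend only on distance to the query cell.
    print(retentive_attention(uniform_scores, mask)[0])
```

With uniform scores, the query cell itself receives the largest weight and the diagonal neighbor the smallest, which is the "prioritize the surroundings" behavior in its simplest form.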
