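Semantics (computer science)
Computer vision
Artificial intelligence
Object (grammar)
Computer science
Radar
Fusion
Remote sensing
Geometry
Computer graphics (images)
Mathematics
Geology
Programming language
Telecommunications
Linguistics
Philosophy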
Authors
Xiaokai Bai,Yu Zhu,Lianqing Zheng,Xiaohan Zhang,Zili Zhou,Xue Zhang,Fang Wang,Jie Bai,Hui‐Liang Shen
Source
Journal: IEEE Robotics and Automation Letters
Date: 2024-12-09
Volume (issue): 10 (1): 828-835
Citations: 10
Identifier
DOI: 10.1109/lra.2024.3513041
Abstract
4D millimeter-wave radar has gained attention in recent years as an emerging sensor for autonomous driving. However, existing 4D radar and camera fusion models often fail to fully exploit the complementary information within each modality and lack deep cross-modal interactions. To address these issues, we propose a novel 4D radar and camera fusion method, named SGDet3D, for 3D object detection. Specifically, we first introduce a dual-branch fusion module that employs geometric depth completion and a semantic radar PillarNet to comprehensively leverage the geometric and semantic information within each modality. We then introduce an object-oriented attention module that employs localization-aware cross-attention to facilitate deep interactions across modalities by allowing queries in bird's-eye view (BEV) to attend to image tokens of interest. We validate SGDet3D on the TJ4DRadSet and View-of-Delft (VoD) datasets. Experimental results demonstrate that SGDet3D effectively fuses 4D radar data and camera images and achieves state-of-the-art performance.
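The abstract does not give the details of the localization-aware cross-attention, so the following is only a minimal sketch of plain scaled dot-product cross-attention, the generic mechanism by which a set of queries (here standing in for BEV queries) attends to a set of tokens (here standing in for image tokens). The function names, the toy vectors, and the absence of learned projections or any localization-aware weighting are all assumptions made for illustration; this is not the SGDet3D implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only).

    queries: list of query vectors (stand-ins for BEV queries)
    keys, values: lists of token vectors (stand-ins for image tokens)
    Returns the attended outputs and the attention weights per query.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        # Similarity between this query and every token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy data: two queries and three tokens, each 2-dimensional.
bev_queries = [[1.0, 0.0], [0.0, 1.0]]
image_tokens = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
fused, attn = cross_attention(bev_queries, image_tokens, image_tokens)
```

In this toy setup each query places most of its weight on the token it is most similar to, which is the basic behaviour that any attention variant restricting queries to relevant tokens builds on.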