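Computer science
Artificial intelligence
Computer vision
Object detection
Projection (relational algebra)
Calibration
Representation (politics)
Object (grammar)
Stability (learning theory)
Convolutional neural network
Pattern recognition (psychology)
Machine learning
Algorithm
Mathematics
Statistics
Politics
Political science
Law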
Authors
Yongqing Chen,Nanyu Li,Dandan Zhu,Charles Zhou,Zhuhua Hu,Yong Bai,Jun Yan
Identifier
DOI:10.1109/jiot.2024.3379471
Abstract
3D object detection based on multi-view cameras and bird's-eye view (BEV) representation is a key task for autonomous driving, as it enables the perception systems to understand the surrounding scenes. However, most existing BEV representation methods rely on the projection matrix of camera intrinsic and extrinsic parameters, which requires a complex and time-consuming calibration process that may introduce errors and degrade the detection performance. Moreover, the calibration results may vary due to environmental changes and affect the stability of the detection system. To address this problem, we propose a calibration-free 3D object detection method that leverages a group-equivariant convolutional network to extract features from multi-view images and a projection network module to learn the implicit 3D-to-2D projection relationship for obtaining BEV representation. Furthermore, we employ contrastive learning to pre-train the projection network module without using manually annotated data. By exploiting the multi-view camera data through contrastive learning, our proposed method eliminates the need for tedious calibration, avoids calibration errors, and reduces the dependence on a large amount of annotated data for calibration-free 3D object detection. We evaluate our method on the nuScenes dataset and demonstrate its competitive performance. Our method improves the stability and reliability of 3D object detection in long-term autonomous driving.
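For context, the calibration-dependent projection that the abstract says most existing BEV methods rely on can be sketched with the standard pinhole camera model, in which a 3D point is mapped to a pixel through the intrinsic matrix K and the extrinsic rotation R and translation t. This is only an illustrative sketch of that conventional baseline, not the calibration-free method proposed in the paper; the function name `project_point` and all numeric values below are hypothetical and are not taken from the paper or from nuScenes.

```python
def project_point(K, R, t, X):
    """Project a 3D world point X to pixel coordinates via K [R | t] (pinhole model)."""
    # World -> camera coordinates: Xc = R * X + t (extrinsic parameters)
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Camera -> homogeneous image coordinates: p = K * Xc (intrinsic parameters)
    p = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    # Perspective division yields pixel coordinates (u, v)
    return (p[0] / p[2], p[1] / p[2])


# Hypothetical intrinsics: focal length 1000 px, principal point (800, 450)
K = [[1000.0, 0.0, 800.0],
     [0.0, 1000.0, 450.0],
     [0.0, 0.0, 1.0]]
# Hypothetical extrinsics: camera frame coincides with the world frame
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]

u, v = project_point(K, R, t, [2.0, 1.0, 10.0])
print(u, v)
```

Any error in K, R, or t shifts the projected pixel directly, which is the sensitivity to calibration that the abstract identifies as motivation for learning the projection relationship implicitly.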