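Artificial intelligence
Computer vision
Computer science
Feature (linguistics)
Radar
Object detection
Sensor fusion
Radar imaging
Feature extraction
Image fusion
Occupancy grid mapping
Pattern recognition (psychology)
Image (mathematics)
Mobile robot
Telecommunications
Philosophy
Linguistics
Robot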
Authors
Jingwei Zhang,Ming Zhang,Zicheng Fang,Yulong Wang,Xian Zhao,Shiliang Pu
Identifier
DOI:10.1109/itsc48978.2021.9564627
Abstract
Obstacle perception based on radar sensors has drawn wide attention in autonomous driving due to its robust performance and low cost. It is important to fuse complementary sensor data, e.g., camera information, to further enhance radar perception. Although much progress has been made, we still observe two problems. First, spatial alignment of multi-modal data is difficult when multiple radar and camera sensors are involved. Second, most existing works rely on object-level fusion, which inevitably loses information and degrades performance. To this end, we propose a feature-level fusion detection framework based on multiple radars and cameras, termed RVDet. We first build an occupancy grid map from 4 corner radars and extract radar features in the bird's-eye view (BEV). Meanwhile, the image features of 4 fish-eye cameras are obtained using a pretrained vision detection model. Then, an adaptive projection network transforms all 4 image features to the BEV domain and integrates them into a dense spatial feature map aligned with the radar features. Finally, the carefully aligned multi-modal feature maps are jointly sent to a deep fusion network to predict the final fused detection results. Experiments on a custom dataset show that the proposed method significantly improves both object detection and positioning performance.
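The two ends of the pipeline described in the abstract, rasterizing radar detections into a BEV grid and combining spatially aligned feature maps, can be illustrated with a minimal NumPy sketch. Everything named here is an assumption made for illustration and does not come from the paper: the function names `radar_points_to_bev_grid` and `fuse_bev_features`, the 40 m x 40 m extent, the 0.5 m cell size, and the binary occupancy encoding. Plain channel concatenation stands in for the learned deep fusion network, and the adaptive projection network is not modeled at all; the image features are simply assumed to be already in BEV.

```python
import numpy as np


def radar_points_to_bev_grid(points_xy, x_range=(-20.0, 20.0),
                             y_range=(-20.0, 20.0), cell_size=0.5):
    """Rasterize radar detections into a binary BEV occupancy grid.

    points_xy: array-like of shape (N, 2), metres in the vehicle frame.
    The ranges and cell size are illustrative defaults, not paper values.
    """
    points_xy = np.asarray(points_xy, dtype=np.float64).reshape(-1, 2)
    n_rows = int(round((x_range[1] - x_range[0]) / cell_size))
    n_cols = int(round((y_range[1] - y_range[0]) / cell_size))
    grid = np.zeros((n_rows, n_cols), dtype=np.float32)

    # Map metric coordinates to integer cell indices.
    rows = np.floor((points_xy[:, 0] - x_range[0]) / cell_size).astype(int)
    cols = np.floor((points_xy[:, 1] - y_range[0]) / cell_size).astype(int)

    # Discard detections that fall outside the grid extent.
    inside = (rows >= 0) & (rows < n_rows) & (cols >= 0) & (cols < n_cols)
    grid[rows[inside], cols[inside]] = 1.0
    return grid


def fuse_bev_features(radar_bev, image_bev):
    """Concatenate two (C, H, W) feature maps along the channel axis.

    This is a simple stand-in for a learned fusion network; it only
    works because both inputs share the same BEV height and width.
    """
    if radar_bev.shape[1:] != image_bev.shape[1:]:
        raise ValueError("radar and image BEV maps must be spatially aligned")
    return np.concatenate([radar_bev, image_bev], axis=0)


# Hypothetical inputs: three radar detections, one of them out of range,
# and an all-zero placeholder for 8-channel image features in BEV.
points = np.array([[0.0, 0.0], [5.2, -3.1], [100.0, 0.0]])
grid = radar_points_to_bev_grid(points)
radar_feat = grid[np.newaxis]  # shape (1, H, W)
image_feat = np.zeros((8,) + grid.shape, dtype=np.float32)
fused = fuse_bev_features(radar_feat, image_feat)
print(grid.shape, fused.shape)
```

The shape check in `fuse_bev_features` reflects the point the abstract stresses: feature-level fusion only makes sense once both modalities live on the same spatial grid, which is why the image features are projected to BEV before being combined with the radar features.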