Keywords
Computer science
Artificial intelligence
Computer vision
Margin (machine learning)
Object detection
Transformer
LiDAR
Benchmark (surveying)
Sensor fusion
Fusion
Perspective (graphical)
Machine learning
Pattern recognition (psychology)
Engineering
Voltage
Geology
Philosophy
Electrical engineering
Remote sensing
Linguistics
Geography
Geodesy
Authors
Kashyap Chitta,Aditya Prakash,Bernhard Jaeger,Zehao Yu,Katrin Renz,Andreas Geiger
Identifier
DOI:10.1109/tpami.2022.3200245
Abstract
How should we integrate representations from complementary sensors for autonomous driving? Geometry-based fusion has shown promise for perception (e.g., object detection, motion forecasting). However, in the context of end-to-end driving, we find that imitation learning based on existing sensor fusion methods underperforms in complex driving scenarios with a high density of dynamic agents. Therefore, we propose TransFuser, a mechanism to integrate image and LiDAR representations using self-attention. Our approach uses transformer modules at multiple resolutions to fuse perspective view and bird's eye view feature maps. We experimentally validate its efficacy on a challenging new benchmark with long routes and dense traffic, as well as the official leaderboard of the CARLA urban driving simulator. At the time of submission, TransFuser outperforms all prior work on the CARLA leaderboard in terms of driving score by a large margin. Compared to geometry-based fusion, TransFuser reduces the average collisions per kilometer by 48%.
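The fusion mechanism the abstract describes can be sketched compactly: flatten the perspective-view image feature map and the bird's-eye-view LiDAR feature map into token sequences, concatenate them, and apply standard self-attention over the joint set so every image token can attend to every LiDAR token and vice versa. The PyTorch sketch below is a minimal illustration of that idea, not the authors' implementation; the class name AttentionFusionBlock, the layer counts, and all shape/hyperparameter choices are assumptions for demonstration.

```python
import torch
import torch.nn as nn


class AttentionFusionBlock(nn.Module):
    """Minimal sketch: fuse an image feature map and a LiDAR BEV feature
    map via self-attention over their concatenated tokens. Illustrative
    only; names and hyperparameters are hypothetical."""

    def __init__(self, channels: int, num_heads: int = 4, num_layers: int = 1):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, img_feat: torch.Tensor, lidar_feat: torch.Tensor):
        # img_feat:   (B, C, Hi, Wi) perspective-view image features
        # lidar_feat: (B, C, Hl, Wl) bird's-eye-view LiDAR features
        B, C, Hi, Wi = img_feat.shape
        _, _, Hl, Wl = lidar_feat.shape

        # Flatten each spatial map into a token sequence: (B, H*W, C).
        img_tokens = img_feat.flatten(2).transpose(1, 2)
        lidar_tokens = lidar_feat.flatten(2).transpose(1, 2)

        # Self-attention over the concatenated tokens lets each modality
        # attend to the other as well as to itself.
        fused = self.encoder(torch.cat([img_tokens, lidar_tokens], dim=1))

        # Split the fused sequence back and restore each spatial layout.
        img_out, lidar_out = fused.split([Hi * Wi, Hl * Wl], dim=1)
        img_out = img_out.transpose(1, 2).reshape(B, C, Hi, Wi)
        lidar_out = lidar_out.transpose(1, 2).reshape(B, C, Hl, Wl)
        return img_out, lidar_out


# Usage sketch with assumed feature-map sizes:
block = AttentionFusionBlock(channels=64)
img = torch.randn(2, 64, 8, 16)     # hypothetical image backbone output
lidar = torch.randn(2, 64, 8, 8)    # hypothetical BEV backbone output
img_fused, lidar_fused = block(img, lidar)
```

Per the abstract, the full model applies this kind of fusion with transformer modules at multiple resolutions, so a complete architecture would interleave blocks like this with successive stages of the image and LiDAR backbones.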