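Point cloud
Frame (networking)
Computer science
Cloud computing
Sensor fusion
Fusion
Scale (ratio)
Artificial intelligence
Geography
Telecommunications
Cartography
Linguistics
Operating system
Philosophy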
Authors
Ye Yue, Honggang Qi, Yongqiang Deng, Juanjuan Li, Hao Liang, Jun Miao
Identifier
DOI:10.1109/tits.2024.3491784
Abstract
In recent years, advances in artificial intelligence have driven the gradual emergence of autonomous driving technologies, and 3D object detection from point clouds has become a key component of this field. Multi-frame fusion of point clouds is a promising technique for enhancing 3D object detection in autonomous driving systems. However, most existing multi-frame detection methods focus primarily on vehicle-side lidar data. Infrastructure-side detection remains relatively unexplored, yet it can enhance vital vehicle-road coordination capabilities. To support this coordination, we propose an efficient multi-frame aggregation and multi-scale fusion network designed specifically for infrastructure-side 3D object detection. First, our key innovation is a novel multi-frame feature aggregation module that integrates information from multiple past point cloud frames to improve detection accuracy. This module comprises a feature pyramid network that fuses multi-scale features and a cross-attention mechanism that learns semantic correlations between frames over time. Next, we incorporate deformable attention, which reduces the computational overhead of aggregation by attending to a small set of sampled locations. Because the model is built from Multi-frame and Multi-scale modules, we name it MAMF-Net. Finally, through extensive experiments on two infrastructure-side datasets, including the V2X-Seq-SPD dataset released by Baidu, we demonstrate that MAMF-Net delivers consistent accuracy improvements over single-frame detectors such as PointPillars, PV-RCNN, and TED-S, most notably improving pedestrian detection by 5%. Our approach also surpasses other multi-frame methods designed for vehicle-side point clouds, such as MPPNet.
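To make the cross-attention idea in the abstract more concrete, the following is a minimal sketch of how features from the current frame might attend to features from past frames. It is not the MAMF-Net implementation: the function name `cross_frame_attention`, the flat list-of-vectors feature layout, the residual connection, and the omission of learned query/key/value projections, the feature pyramid, and deformable sampling are all simplifying assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_frame_attention(current, past_frames):
    """Fuse current-frame features with past-frame features (illustrative only).

    current:     list of N feature vectors from the current frame
    past_frames: list of T frames, each a list of feature vectors
    Returns (fused, weights): fused has N vectors; weights[i] holds the
    attention of query i over all past-frame features.
    Assumption: learned Q/K/V projections are omitted for brevity.
    """
    # Flatten all past frames into one memory of keys/values.
    memory = [feat for frame in past_frames for feat in frame]
    dim = len(current[0])
    fused, all_weights = [], []
    for query in current:
        # Scaled dot-product similarity between the query and every memory entry.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in memory
        ]
        weights = softmax(scores)
        # Weighted sum of memory features gives the temporal context.
        context = [
            sum(w * value[i] for w, value in zip(weights, memory))
            for i in range(dim)
        ]
        # Residual connection keeps the current-frame information.
        fused.append([q + c for q, c in zip(query, context)])
        all_weights.append(weights)
    return fused, all_weights


# Toy example: two current-frame features, two past frames with one feature each.
current = [[1.0, 0.0], [0.0, 1.0]]
past_frames = [[[1.0, 0.0]], [[0.0, 1.0]]]
fused, weights = cross_frame_attention(current, past_frames)
```

In this toy setup each query places more weight on the past-frame feature it resembles most. Full attention of this kind scores every query against every past location, which is the cost that the deformable attention mentioned in the abstract reduces by looking only at sampled locations.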