测距
激光雷达
计算机科学
人工智能
计算机视觉
目标检测
遥感
对象(语法)
激光测距
模式识别(心理学)
地理
光学
激光器
物理
电信
作者
Fuqiang You,Haoran Zhang,Hao Chen
标识
DOI:10.1177/30504554251340241
摘要
Recent studies have shown that pillar-based detectors perform better in terms of both accuracy and speed, but these detectors perform poorly for detecting small objects such as pedestrians and cyclists. To solve this problem, we propose a highly efficient pillar-based model named MASNet, which mainly consists of a pillar mix attention (PMA) module, an attention-pooling operation, and a focal sparse network (FSN) module. The PMA module encodes the pillar features by fusion of the point-wise attention module and the channel-wise attention module. The attention-pooling operation aggregates the attention-encoded pillar features in a more comprehensive way to obtain the most expressive pillar features. In addition, the FSN module exploits the intrinsic sparsity of the data by introducing focal sparse convolution, which enriches the learned pillar features in the foreground without adding redundant pillars in other regions. On the KITTI three-dimensional (3D) Object Detection Benchmark, it achieves a 3D average precision of (77.81%, 60.30%, and 53.92%) in easy, moderate, and hard levels, which outperforms other pillar-based methods for the detection of cyclists. Additionally, our method is only 0.52% lower than the top-ranked method (pillar feature network (PIFENet)) on the KITTI Bird's Eye View pedestrian leaderboard, but our inference speed reaches 41 frames per second ahead of PIFENet by 57.69%.
科研通智能强力驱动
Strongly Powered by AbleSci AI