Point cloud
Computer science
Minimum bounding box
Artificial intelligence
Computer vision
LiDAR
Object tracking
Remote sensing
Authors
Jiantao Gao,Xu Yan,Weibing Zhao,Zhen Lyu,Yinghong Liao,Chaoda Zheng
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Pages: 1-13
Citations: 4
Identifier
DOI: 10.1109/tnnls.2022.3233562
Abstract
Single object tracking (SOT) is one of the most active research directions in the field of computer vision. Compared with 2-D image-based SOT, which has already been well studied, SOT on 3-D point clouds is a relatively emerging research field. In this article, a novel approach, namely, the contextual-aware tracker (CAT), is investigated to achieve superior 3-D SOT through spatially and temporally contextual learning from the LiDAR sequence. More precisely, in contrast to previous 3-D SOT methods that merely exploit the point clouds inside the target bounding box as the template, CAT generates templates by adaptively including the surroundings outside the target box to use available ambient cues. This template generation strategy is more effective and rational than the previous area-fixed one, especially when the object contains only a small number of points. Moreover, it is observed that LiDAR point clouds in 3-D scenes are often incomplete and vary significantly from one frame to another, which makes the learning process more difficult. To this end, a novel cross-frame aggregation (CFA) module is proposed to enhance the feature representation of the template by aggregating features from a historical reference frame. Leveraging such schemes enables CAT to achieve robust performance, even in the case of extremely sparse point clouds. The experiments confirm that the proposed CAT outperforms the state-of-the-art methods on both the KITTI and NuScenes benchmarks, achieving 3.9% and 5.6% improvements in terms of precision.
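The template-generation idea in the abstract, cropping points from a region enlarged beyond the target bounding box so that ambient points are retained, can be illustrated with a minimal sketch. This is not the paper's implementation: the axis-aligned box, the `context_ratio` parameter, and the function name are all illustrative assumptions.

```python
import numpy as np

def crop_template(points, center, size, context_ratio=0.25):
    """Crop template points from an axis-aligned box enlarged by a context
    margin, so points just outside the target box are also kept.
    `context_ratio` is a hypothetical knob, not a parameter from the paper."""
    half = (np.asarray(size, dtype=float) / 2.0) * (1.0 + context_ratio)
    lo = np.asarray(center, dtype=float) - half
    hi = np.asarray(center, dtype=float) + half
    # keep points whose every coordinate lies inside the enlarged box
    mask = np.all((points >= lo) & (points <= hi), axis=1)
    return points[mask]

# Toy frame: target box centered at the origin with size (2, 2, 2).
pts = np.array([[0.0, 0.0, 0.0],   # inside the target box
                [1.1, 0.0, 0.0],   # outside the box, inside the context margin
                [5.0, 5.0, 5.0]])  # far away, excluded
tmpl = crop_template(pts, center=[0, 0, 0], size=[2, 2, 2], context_ratio=0.25)
```

With `context_ratio=0.25` the half-extent grows from 1.0 to 1.25, so the second point survives the crop while a fixed-area crop at the original box would have dropped it; this is the ambient-cue effect the abstract argues helps sparse objects.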