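Computer science
Artificial intelligence
Computer vision
Fuse (electrical)
Video tracking
Active vision
Embedding
Object (grammar)
Feature (linguistics)
Reinforcement learning
Tracking (education)
Visualization
Eye movement
Engineering
Psychology
Pedagogy
Linguistics
Philosophy
Electrical engineering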
Authors
Mao Xi,Yun Zhou,Zheng Chen,Wengang Zhou,Houqiang Li
Identifier
DOI:10.1109/tcsvt.2021.3107153
Abstract
In active object tracking, given a visual observation as input, the goal is to lock onto the target by autonomously adjusting the camera's position and pose. Previous works on active tracking assume that there is only one object (a person) in the environment, with no distractors. In this work, toward a more realistic setting, we move to a more challenging scenario in which the tracker moves freely in 3D space, like an unmanned aerial vehicle (UAV), to track a person in various complex scenes with multiple distractors. To this end, we propose a novel end-to-end anti-distractor active object tracking framework that introduces multiple attention modules. On the one hand, we use the target template to learn an embedding that serves as channel-wise attention over the current observation, distinguishing the target from the distractors. On the other hand, temporal attention is introduced to fuse the observation history into a feature representation, which is then fed into a reinforcement learning network that outputs the tracker's action. To evaluate our method, we build several multi-object 3D environments in Unreal Engine, and extensive experiments demonstrate the effectiveness of our approach.
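The two attention steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' architecture, which the abstract does not specify. The sigmoid gate, the global average pooling, the scaled dot-product scoring with the latest frame as query, and all function names and tensor shapes are assumptions chosen only to show the general idea; the reinforcement learning network that would consume the fused feature is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(obs_feat, template_emb):
    """Reweight observation channels with a template-derived gate.

    obs_feat:     (C, H, W) feature map of the current observation
    template_emb: (C,) embedding of the target template
    Assumption: a sigmoid gate; the paper may use a different form.
    """
    gate = 1.0 / (1.0 + np.exp(-template_emb))  # values in (0, 1)
    return obs_feat * gate[:, None, None]       # broadcast over H and W

def temporal_attention(history):
    """Fuse per-frame features into one vector.

    history: (T, D) features, oldest first
    Assumption: scaled dot-product scores with the latest frame as query.
    """
    query = history[-1]
    scores = history @ query / np.sqrt(history.shape[1])  # (T,)
    weights = softmax(scores)                             # sums to 1
    fused = weights @ history                             # (D,)
    return fused, weights

# Toy data standing in for backbone features (hypothetical sizes).
rng = np.random.default_rng(0)
C, H, W, T = 8, 4, 4, 5
template_emb = rng.normal(size=C)
frames = rng.normal(size=(T, C, H, W))

# Gate each frame by the template, then pool spatially to one vector per frame.
per_frame = np.stack(
    [channel_attention(f, template_emb).mean(axis=(1, 2)) for f in frames]
)
fused, weights = temporal_attention(per_frame)
```

In a full system, `fused` would be the state representation passed to the policy network that selects the tracker's action.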