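Computer science
Artificial intelligence
Discriminative
Pattern recognition (psychology)
Convolutional neural network
Feature (linguistics)
Feature vector
Histogram of oriented gradients
BitTorrent tracker
Feature extraction
Eye movement
Computer vision
Feature learning
Video tracking
Histogram
Object (grammar)
Image (mathematics)
Philosophy
Linguistics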
Authors
Yi Cao, Hongbing Ji, Wenbo Zhang, Shahram Shirani
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology
[Institute of Electrical and Electronics Engineers]
Date: 2021-03-01
Volume (issue), pages: 32 (2): 674-689
Citations: 1
Identifiers
DOI:10.1109/tcsvt.2021.3063001
Abstract
Tracking-by-detection algorithms have considerably improved tracking performance since the introduction of convolutional neural networks (CNNs). However, most trackers directly exploit standard scalar-output CNN features, which may not capture enough feature encoding information, instead of aggregated CNN features in vector-output form. In this paper, we propose an end-to-end feature aggregation capsule framework. First, building on an existing CNN, we aggregate a certain number of similar position-aware CNN features into a capsule to model feature similarity. The resulting vector-level feature capsules (rather than the previous scalar-level pointwise features) are used for differentiation learning. Second, we propose a group attention module to better model the entity representation across different capsule groups, thereby improving the overall discriminative capability. Third, to reduce the prediction interference caused by the side effect of increased dimensionality within capsules, we propose a penalty attention module. This strategy dynamically adjusts the values of neurons by estimating whether they are beneficial or harmful to tracking. Experimental results on five representative benchmarks (UAVDT, DTB70, UAV123, VOT2016 and VOT2018) demonstrate the excellent tracking performance of our dual-attention-based capsule tracker (DACapT). In particular, it exceeds the previous top tracker by 4.6%/1.9% in the precision/success evaluations on UAVDT.
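
The first step of the abstract, turning scalar CNN features into vector-level capsules, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that consecutive channels are grouped (the abstract does not state how "similar" features are selected) and that the squash nonlinearity commonly used in capsule networks is applied. The function names `squash` and `aggregate_to_capsules` and the parameter `capsule_dim` are hypothetical.

```python
import numpy as np


def squash(v, axis=-1, eps=1e-8):
    """Capsule-style squash: keeps each vector's direction, maps its norm into [0, 1)."""
    sq_norm = np.sum(v * v, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * v / np.sqrt(sq_norm + eps)


def aggregate_to_capsules(features, capsule_dim):
    """Group a (C, H, W) scalar feature map into capsules of shape
    (C // capsule_dim, H, W, capsule_dim).

    Assumption: consecutive channels form one capsule. The paper groups
    *similar* position-aware features; its criterion is not in the abstract.
    """
    c, h, w = features.shape
    if c % capsule_dim != 0:
        raise ValueError("channel count must be divisible by capsule_dim")
    groups = c // capsule_dim
    # Channel index g * capsule_dim + d becomes component d of capsule group g.
    caps = features.reshape(groups, capsule_dim, h, w).transpose(0, 2, 3, 1)
    return squash(caps, axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.normal(size=(8, 4, 4))  # stand-in for a CNN feature map
    capsules = aggregate_to_capsules(feature_map, capsule_dim=4)
    print(capsules.shape)
```

Each spatial position then carries a small vector per capsule group rather than a single scalar per channel, which is the kind of vector-level representation the attention modules described above would operate on.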