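Computer science
Action recognition
Graph
Convolution (computer science)
Artificial intelligence
Pattern recognition (psychology)
Skeleton (computer programming)
Inference
Convolutional neural network
Theoretical computer science
Artificial neural network
Programming language
Class (philosophy)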
Authors
Maosen Li, Siheng Chen, Chen Xu, Ya Zhang, Yanfeng Wang, Qi Tian
Identifier
DOI:10.1109/cvpr.2019.00371
Abstract
Action recognition with skeleton data has recently attracted much attention in computer vision. Previous studies are mostly based on fixed skeleton graphs, which capture only local physical dependencies among joints and may miss implicit joint correlations. To capture richer dependencies, we introduce an encoder-decoder structure, called the A-link inference module, to capture action-specific latent dependencies, i.e., actional links, directly from actions. We also extend the existing skeleton graphs to represent higher-order dependencies, i.e., structural links. Combining the two types of links into a generalized skeleton graph, we further propose the actional-structural graph convolution network (AS-GCN), which stacks actional-structural graph convolution and temporal convolution as a basic building block, to learn both spatial and temporal features for action recognition. A future pose prediction head is added in parallel to the recognition head to help capture more detailed action patterns through self-supervision. We validate AS-GCN in action recognition using two skeleton data sets, NTU-RGB+D and Kinetics. The proposed AS-GCN achieves consistently large improvement compared to the state-of-the-art methods. As a side product, AS-GCN also shows promising results for future pose prediction.
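The idea of extending a fixed skeleton graph with higher-order dependencies can be illustrated with a small sketch. It is not the authors' implementation: the five-joint chain, the helper names (`adjacency`, `matmul`, `hop_reachability`), and the use of plain adjacency powers are all assumptions chosen for illustration. It only shows how joints that are not physically connected become linked once two-hop walks are taken into account.

```python
# Toy 5-joint skeleton laid out as a chain 0-1-2-3-4 (hypothetical layout,
# not a joint set from NTU-RGB+D or Kinetics).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
NUM_JOINTS = 5


def adjacency(num_joints, edges):
    """Symmetric 0/1 adjacency matrix of the skeleton graph (no self-loops)."""
    a = [[0] * num_joints for _ in range(num_joints)]
    for i, j in edges:
        a[i][j] = 1
        a[j][i] = 1
    return a


def matmul(x, y):
    """Plain-Python matrix product for small square matrices."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def hop_reachability(adj, order):
    """Entry [i][j] is 1 if a walk of exactly `order` edges joins joints i and j."""
    power = adj
    for _ in range(order - 1):
        power = matmul(power, adj)
    return [[1 if v > 0 else 0 for v in row] for row in power]


A = adjacency(NUM_JOINTS, EDGES)   # first-order (physical) links
A2 = hop_reachability(A, 2)        # second-order links under this sketch
```

In this toy graph, joints 0 and 2 share no bone, so `A[0][2]` is 0, while `A2[0][2]` is 1 because both attach to joint 1. A graph convolution that aggregates over `A2` as well as `A` would therefore mix features from joints that a fixed first-order skeleton graph keeps apart.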