Computer science
Convolution (computer science)
Action recognition
Graph
Skeleton (computer programming)
Artificial intelligence
Natural language processing
Pattern recognition (psychology)
Theoretical computer science
Programming language
Artificial neural network
Class (philosophy)
Authors
Moyan Zhang, Zhenzhen Quan, Wei Wang, Zhe Chen, Xiaoshan Guo, Yujun Li
Identifier
DOI: 10.1109/jsen.2024.3388154
Abstract
In recent years, the field of action recognition using spatio-temporal graph convolution models for human skeletal data has made significant progress. However, current methodologies tend to prioritize spatial graph convolution, leaving valuable information in the skeletal data underutilized. This limits the model's ability to capture complex data patterns, especially in time-series data, and ultimately degrades recognition accuracy. To address these issues, this paper introduces an Attention-based Semantic-guided Multi-stream Graph Convolution Network (ASMGCN), which more fully extracts the deep features in skeletal data. Specifically, ASMGCN incorporates a novel temporal convolutional module featuring an attention mechanism and a multiscale residual network, which dynamically adjusts the weights between skeleton graphs at different time points, enabling better capture of relational features. In addition, semantic information is introduced into the loss function, enhancing the model's ability to distinguish similar actions. Furthermore, the coordinate information of different joints within the same frame is explored to generate new relative-position features, termed centripetal and centrifugal streams, based on the center of gravity. These features are integrated with the original position and motion features of the skeleton, including joints and bones, enriching the inputs to the GCN. Experimental results on the NW-UCLA, NTU RGB+D (NTU60), and NTU RGB+D 120 (NTU120) datasets demonstrate that ASMGCN outperforms other state-of-the-art (SOTA) human action recognition (HAR) methods, signifying its potential for advancing skeleton-based action recognition.
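The centripetal and centrifugal streams described above are relative-position features derived from joint coordinates and the body's center of gravity. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the center of gravity is taken as the per-frame mean of the joint coordinates, and the two streams are taken as the signed displacement vectors toward and away from that center. The function name `build_relative_streams` and the NTU-style (frames, joints, coordinates) tensor layout are illustrative choices.

```python
# Minimal sketch (assumptions noted below), not the paper's released code.
import numpy as np

def build_relative_streams(joints: np.ndarray):
    """joints: array of shape (T, V, C) -- T frames, V joints, C coordinates.

    Returns (centripetal, centrifugal), each of shape (T, V, C).
    """
    # Center of gravity per frame: mean over the joint axis. This is an
    # assumption; a mass-weighted mean over body parts would be an alternative.
    center = joints.mean(axis=1, keepdims=True)   # (T, 1, C)
    centripetal = center - joints                 # vector from joint toward center
    centrifugal = joints - center                 # vector from center toward joint
    return centripetal, centrifugal

if __name__ == "__main__":
    # Toy example: 2 frames, 25 joints (NTU RGB+D layout), 3-D coordinates.
    rng = np.random.default_rng(0)
    skel = rng.standard_normal((2, 25, 3))
    inward, outward = build_relative_streams(skel)
    print(inward.shape, outward.shape)  # (2, 25, 3) (2, 25, 3)
```

Note that as plain negations of each other the two sketched streams would be redundant on their own; the paper's actual definitions presumably differ (for example, by following bone connections ordered toward versus away from the center of gravity). In a multi-stream setup like the one the abstract describes, each such stream would feed its own GCN branch alongside the joint, bone, and motion streams, with the branch outputs fused for the final prediction.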