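Computer science
Artificial intelligence
Hypergraph
RGB color model
Artificial neural network
Pattern recognition (psychology)
Key (lock)
Spatial analysis
Encoding (memory)
Machine learning
Mathematics
Discrete mathematics
Remote sensing
Computer security
Geology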
Authors
Zhixuan Wu,Nan Ma,Zhi Tao,Genbao Xu
Identifier
DOI:10.1109/dsa59317.2023.00068
Abstract
To reduce the large number of network parameters needed to compute spatial-temporal features of actions in multi-view video sequences, this paper proposes a Spatial-Temporal Hypergraph Neural Network based on Attention Mechanism (STHGNN-AM). The method combines a Temporal Attention Mechanism based on Trainable Threshold (TAM-TT) with a Multi-scale Spatial-Temporal Residual Module (MS-STRM) to recognize actions from multi-view data. TAM-TT uses a learnable threshold to extract key frames of actions from the video frames of the different views that are input to the module. MS-STRM further improves model performance: it extracts features at multiple scales, modeling long-term and short-term semantic information to capture temporal changes between frames, while high-order semantic features of actions are learned in a hypergraph neural network. Comparative experiments on the NTU RGB+D and imitating traffic police gestures datasets show that the proposed method achieves higher recognition accuracy than the compared methods, improving a machine's ability to understand human body language during interaction.
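The key-frame idea behind TAM-TT can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything here is an assumption for illustration only: the function name `select_key_frames`, the per-frame attention scores, the sigmoid gate, and the `sharpness` parameter are hypothetical. In a trainable setting the threshold would be a learned parameter; here it is a fixed number.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def select_key_frames(scores, threshold, sharpness=10.0):
    """Hypothetical key-frame gate, not the paper's actual TAM-TT.

    scores:    one attention score per frame (assumed to be given).
    threshold: the cut-off; in a trained model this would be learnable.
    sharpness: assumed scaling factor that makes the soft gate steeper.

    Returns the indices of frames whose soft gate exceeds 0.5
    (equivalently, whose score exceeds the threshold) and the gates.
    """
    gates = [sigmoid(sharpness * (s - threshold)) for s in scores]
    kept = [i for i, g in enumerate(gates) if g > 0.5]
    return kept, gates


# Illustrative scores for five frames (made-up values).
scores = [0.10, 0.80, 0.35, 0.90, 0.20]
kept, gates = select_key_frames(scores, threshold=0.5)
print(kept)  # [1, 3]
```

The soft sigmoid gate is used instead of a hard comparison because it stays differentiable with respect to the threshold, which is one common way to make such a cut-off trainable; whether the paper does this is not stated in the abstract.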