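RGB color model
Feature (linguistics)
Artificial intelligence
Computer science
Computer vision
Convolutional neural network
Feature extraction
Feature learning
Pose
Deep learning
Representation (politics)
Pattern recognition (psychology)
Linguistics
Politics
Philosophy
Law
Political science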
Authors
Huiying Li, Qing Lei, Hong-Bo Zhang, Ji-Xiang Du, Shangce Gao
Identifier
DOI:10.1016/j.jvcir.2022.103625
Abstract
Most existing Action Quality Assessment (AQA) methods for scoring sports videos focus on evaluating a single action, or several actions in a predefined sequence, performed in short sports videos such as diving and vault. They extract features directly from RGB videos through 3D ConvNets, which mixes the features with ambiguous scene information. To investigate how effective deep pose feature learning is at automatically evaluating complicated activities in long-duration sports videos, such as figure skating and artistic gymnastics, we propose a skeleton-based deep pose feature learning method. For pose feature extraction, a spatial–temporal pose extraction module (STPE) is built to capture subtle changes in human body movements and obtain detailed representations of skeletal data in the space and time dimensions. For temporal information representation, an inter-action temporal relation extraction module (ATRE) is implemented with a recurrent neural network to model the dynamic temporal structure of skeletal subsequences. We evaluate the proposed method on the figure skating activity of the MIT-Skate and FIS-V datasets. The experimental results show that the proposed method is more effective than RGB video-based deep feature learning methods, including SENet and C3D. A significant improvement in Spearman Rank Correlation (SRC) is achieved on the MIT-Skate dataset. On the FIS-V dataset, for both the Total Element Score (TES) and the Program Component Score (PCS), the predicted scores achieve better SRC and MSE against the judges' scores than the SENet and C3D feature methods.
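The abstract reports results as Spearman Rank Correlation (SRC) and mean squared error (MSE) between predicted scores and judges' scores. As a minimal sketch of how these two metrics are commonly computed, the Python below uses only the standard library. The function names and the sample scores are hypothetical and chosen for illustration; they are not taken from the paper or its datasets, and the authors' evaluation code may differ.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rank_correlation(predicted, judged):
    """Pearson correlation of the two rank vectors.

    Assumes both inputs have the same length and neither is constant
    (a constant input gives zero variance and a division by zero).
    """
    rp, rj = average_ranks(predicted), average_ranks(judged)
    n = len(rp)
    mean_p, mean_j = sum(rp) / n, sum(rj) / n
    cov = sum((a - mean_p) * (b - mean_j) for a, b in zip(rp, rj))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_j = sum((b - mean_j) ** 2 for b in rj)
    return cov / sqrt(var_p * var_j)


def mean_squared_error(predicted, judged):
    """Average of squared differences between paired scores."""
    return sum((p - j) ** 2 for p, j in zip(predicted, judged)) / len(predicted)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    predicted_scores = [61.2, 48.5, 70.1, 55.0]
    judge_scores = [63.0, 50.2, 66.4, 58.9]
    print("SRC:", spearman_rank_correlation(predicted_scores, judge_scores))
    print("MSE:", mean_squared_error(predicted_scores, judge_scores))
```

SRC depends only on the ordering of the scores, so it measures whether the model ranks performances as the judges do, while MSE measures how far the predicted values are from the judges' values. The two can therefore move independently.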