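Hypergraph
Computer science
Pairwise comparison
Feature (linguistics)
Similarity (geometry)
Construct (Python library)
Theoretical computer science
Artificial intelligence
Convolution (computer science)
Cluster analysis
Pattern recognition (psychology)
Image (mathematics)
Artificial neural network
Mathematics
Linguistics
Philosophy
Discrete mathematics
Programming language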
Authors
Xiaolin Zhu,Dongli Wang,Jianxun Li,Rui Su,Qin Wan,Yan Zhou
Identifier
DOI:10.1109/tnnls.2024.3422265
Abstract
Group activity recognition (GAR) has recently drawn growing interest in the video analysis and computer vision communities. Current GAR models are often impractical because they assume that all interactions between actors are pairwise, which captures only part of the information in real group interactions. Motivated by this, we design a dynamical attention hypergraph convolutional network framework, referred to as DAHGCN, for precise GAR; it models complete interactions and captures the high-order relationships among the actors involved in real-life scenarios. Specifically, to learn complementary feature representations for fine-grained GAR, we propose a multilevel feature descriptor (MLFD) module. Furthermore, to learn higher-order interaction relationships, we construct a DAHGCN that accommodates complex group interactions: it dynamically changes the topology of the hypergraph and learns key representations by means of similarity-based shared nearest-neighbor (SSNN) clustering and attention mechanisms on the hypergraph. Finally, a multiscale temporal convolution (MSTC) module explores long-range temporal dynamic correlations across frames. Comprehensive experiments on three commonly used GAR datasets show that the proposed method achieves the best performance when compared with state-of-the-art methods.
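To make the contrast between pairwise and higher-order modeling concrete, the following is a minimal sketch of a generic hypergraph convolution layer. It uses the widely used normalization X' = ReLU(Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X Θ) rather than the DAHGCN layer itself, whose exact form the abstract does not specify. The function name `hypergraph_conv`, the toy incidence matrix, and the feature sizes are illustrative assumptions; the dynamic topology updates, SSNN clustering, and attention described above are not implemented here.

```python
import numpy as np


def _safe_power(d, p):
    """Raise positive entries of d to the power p; leave zero entries at zero."""
    d = np.asarray(d, dtype=float)
    out = np.zeros_like(d)
    mask = d > 0
    out[mask] = d[mask] ** p
    return out


def hypergraph_conv(X, H, Theta, w=None):
    """Generic hypergraph convolution (illustrative, not the paper's layer).

    X:     (n_actors, d_in) actor features
    H:     (n_actors, n_edges) incidence matrix; H[v, e] = 1 if actor v
           belongs to hyperedge e, so one hyperedge can join any number
           of actors, not just two
    Theta: (d_in, d_out) learnable projection
    w:     (n_edges,) hyperedge weights, defaults to all ones
    """
    H = np.asarray(H, dtype=float)
    w = np.ones(H.shape[1]) if w is None else np.asarray(w, dtype=float)

    dv = H @ w          # weighted vertex degrees
    de = H.sum(axis=0)  # hyperedge degrees (number of actors per hyperedge)

    left = _safe_power(dv, -0.5)[:, None] * H          # Dv^(-1/2) H
    mid = left * (w * _safe_power(de, -1.0))[None, :]  # ... W De^(-1)
    A = mid @ left.T                                   # ... H^T Dv^(-1/2)

    return np.maximum(A @ X @ Theta, 0.0)              # ReLU activation


# Toy example (assumed data): 4 actors, 2 hyperedges.
# Hyperedge 0 groups actors {0, 1, 2}; hyperedge 1 groups actors {2, 3}.
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))      # 8-dimensional feature per actor
Theta = rng.normal(size=(8, 2))  # project to 2 output channels

out = hypergraph_conv(X, H, Theta)
print(out.shape)  # (4, 2)
```

In this sketch a single column of `H` can connect three or more actors at once, which is the property that a pairwise adjacency matrix cannot express directly. A dynamic variant would recompute `H` from the current features at each layer, for example from a clustering step.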