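Computer science
Action recognition
Salient
Channel (broadcasting)
Artificial intelligence
Feature (linguistics)
Action (physics)
Relevance (law)
Spatial analysis
Pattern recognition (psychology)
Computer network
Remote sensing
Class (philosophy)
Geography
Physics
Quantum mechanics
Philosophy
Political science
Law
Linguistics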
Authors
Yiying Li,Yulin Li,Yanfei Gu
Identifier
DOI:10.1145/3404555.3404592
Abstract
Recent years have witnessed the effectiveness of two-stream attention networks for video action recognition. However, most methods adopt the same structure for the spatial stream and the temporal stream, which produces a large amount of redundant information and often ignores the relevance among channels. In this paper, we propose a channel-wise spatial attention with spatiotemporal heterogeneous framework, a new approach to action recognition. First, we employ two different network structures for the spatial stream and the temporal stream to improve the performance of action recognition. Then, we design a channel-wise network and a spatial network inspired by the self-attention mechanism to obtain the fine-grained and salient information of the video. Finally, the video feature for action recognition is generated by end-to-end training. Experimental results on the HMDB51 and UCF101 datasets show that our method can effectively recognize the actions in the video.
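The abstract does not specify how the channel-wise and spatial networks are built, so the following is only a minimal sketch of what self-attention over channels and over spatial positions can look like, not the authors' implementation. It assumes a single frame's feature map flattened to C channels by N positions; the function names, the `gamma` residual weight, and the dot-product similarity are all illustrative assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def row_self_attention(x, gamma=1.0):
    """Re-weight each row of x by its dot-product similarity to every row.

    Hypothetical helper: gamma scales the attended features before the
    residual addition, so gamma=0.0 returns the input unchanged.
    """
    energy = matmul(x, transpose(x))          # rows x rows similarity
    weights = [softmax(r) for r in energy]    # each row sums to 1
    attended = matmul(weights, x)             # same shape as x
    return [[gamma * a + v for a, v in zip(a_row, x_row)]
            for a_row, x_row in zip(attended, x)]


def channel_attention(x, gamma=1.0):
    """x is C x N; attention is computed among the C channels."""
    return row_self_attention(x, gamma)


def spatial_attention(x, gamma=1.0):
    """x is C x N; attention is computed among the N spatial positions."""
    return transpose(row_self_attention(transpose(x), gamma))


# Toy feature map: 2 channels, 3 spatial positions (made-up values).
features = [[1.0, 2.0, 3.0],
            [0.5, 0.0, -1.0]]
refined = spatial_attention(channel_attention(features))
```

In this sketch the two attention steps are applied in sequence and both preserve the C x N shape; a real two-stream model would apply such modules to learned convolutional features in each stream and train everything end to end.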