Keywords
Discriminative, Computer science, Adversarial learning, Artificial intelligence, Machine learning, Action recognition, Context, Representation
Authors
Na Zheng,Xuemeng Song,Tianyu Su,Weifeng Liu,Yan Yan,Liqiang Nie
Abstract
Egocentric early action prediction aims to recognize actions from the first-person view by observing only a partial video segment, which is challenging due to the limited context information in the partial video. In this article, we propose a novel multi-modal adversarial knowledge distillation framework to tackle the egocentric early action prediction problem. In particular, our approach involves a teacher network that learns an enhanced representation of the partial video by considering the future unobserved video segment, and a student network that mimics the teacher network to produce a powerful representation of the partial video and, based on it, predict the action label. To promote knowledge distillation between the teacher and student networks, we seamlessly integrate adversarial learning with latent and discriminative knowledge regularizations, encouraging the learned representations of the partial video to be more informative and discriminative for action prediction. Finally, we devise a multi-modal fusion module to comprehensively predict the action label. Extensive experiments on two public egocentric datasets validate the superiority of our method over state-of-the-art methods. We have released the code and involved parameters to benefit other researchers.
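To make the distillation objective in the abstract concrete, the following is a minimal sketch of how a latent (feature-mimicking) term and a discriminative (action-supervision) term could be combined. All function names, the loss composition, and the weighting parameters are illustrative assumptions, not the authors' released implementation; the adversarial term is omitted for brevity.

```python
import math

def mse(a, b):
    """Latent knowledge regularization (assumed form): pull the student's
    partial-video features toward the teacher's enhanced features."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, label):
    """Discriminative knowledge regularization (assumed form):
    supervise the student's action prediction with the true label."""
    return -math.log(softmax(logits)[label])

def distillation_loss(student_feat, teacher_feat, student_logits, label,
                      lambda_latent=1.0, lambda_disc=1.0):
    # Hypothetical total objective: latent term + discriminative term.
    # The paper's adversarial learning component is not sketched here.
    return (lambda_latent * mse(student_feat, teacher_feat)
            + lambda_disc * cross_entropy(student_logits, label))

# Toy example: 2-dim features, 3 action classes, ground-truth label 0.
loss = distillation_loss([0.2, 0.8], [0.3, 0.7], [2.0, 0.5, 0.1], 0)
```

When the student's features match the teacher's and the prediction is confident and correct, both terms shrink toward zero, which is the intended behavior of the combined regularization.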