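Information bottleneck method
Bottleneck
Computer science
Regularization (linguistics)
Artificial intelligence
Representation (politics)
Task (project management)
Machine learning
Pattern recognition (psychology)
Background (archaeology)
Mutual information
Biology
Politics
Embedded system
Paleontology
Economics
Management
Law
Political science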
Authors
Jiawei Fan, Yu Zhao, Xie Yu, Lihua Ma, Junqi Liu, Fangqiu Yi, Boxun Li
Identifier
DOI:10.1145/3503161.3548326
Abstract
An optimal representation should contain the maximum task-relevant information and the minimum task-irrelevant information, as revealed by the Information Bottleneck Principle. In video action recognition, CNN-based approaches have obtained better spatio-temporal representations by modeling temporal context. However, these approaches still suffer from low generalization. In this paper, we propose a moderate optimization-based approach called Dual-view Temporal Regularization (DTR), based on the Information Bottleneck Principle, for an effective and generalized video representation without sacrificing any efficiency of the model. On the one hand, we design Dual-view Regularization (DR) to constrain task-irrelevant information, which can effectively compress background and irrelevant motion information. On the other hand, we design Temporal Regularization (TR) to maintain task-relevant information by finding an optimal difference between frames, which helps extract sufficient motion information. The experimental results demonstrate that: (1) DTR is orthogonal to temporal modeling as well as data augmentation, and it achieves general improvement on both model-based and data-based approaches; (2) DTR is effective across 7 different datasets, especially on motion-centric datasets, i.e., SSv1/SSv2, on which DTR obtains 6%/3.8% absolute gains in top-1 accuracy.
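The trade-off the abstract attributes to the Information Bottleneck Principle can be illustrated on a toy discrete problem. The sketch below is not the DTR method and does not come from the paper. It assumes the standard Lagrangian form of the objective, I(X;Z) − β·I(Z;Y), and every name in it (`mutual_information`, `ib_objective`, `beta`, the two-bit input) is hypothetical and chosen only for illustration. The input X has one task-relevant bit and one task-irrelevant bit; a representation that discards the irrelevant bit keeps all the label information while carrying less about X.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information in bits between two discrete variables,
    estimated from a list of equally likely (u, v) outcomes."""
    n = len(pairs)
    joint = Counter(pairs)
    count_u = Counter(u for u, _ in pairs)
    count_v = Counter(v for _, v in pairs)
    mi = 0.0
    for (u, v), c in joint.items():
        p_uv = c / n
        # p(u, v) / (p(u) * p(v)) simplifies to c * n / (count_u * count_v)
        mi += p_uv * math.log2(c * n / (count_u[u] * count_v[v]))
    return mi


def ib_objective(xs, zs, ys, beta):
    """Assumed IB Lagrangian: I(X;Z) - beta * I(Z;Y). Lower is better."""
    return mutual_information(list(zip(xs, zs))) - beta * mutual_information(list(zip(zs, ys)))


# Toy input: x = (relevant_bit, irrelevant_bit), all four values equally likely.
xs = [(a, b) for a in (0, 1) for b in (0, 1)]
ys = [a for a, _ in xs]          # the label depends only on the relevant bit

z_full = list(xs)                # representation that copies the whole input
z_compact = [a for a, _ in xs]   # representation that drops the irrelevant bit

beta = 2.0                       # hypothetical trade-off weight
score_full = ib_objective(xs, z_full, ys, beta)
score_compact = ib_objective(xs, z_compact, ys, beta)
print(score_full, score_compact)
```

Both representations carry 1 bit about the label, but the compact one carries only 1 bit about the input instead of 2, so it scores lower (better) under the assumed objective for any positive `beta`.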