Authors
Zhixiao Fu,Xinyuan Chen,Daizong Liu,Xiaoye Qu,Jianfeng Dong,Xuhong Zhang,Shouling Ji
Identifier
DOI:10.1016/j.imavis.2023.104686
Abstract
Synthesizing videos with forged faces is a safety-critical problem that has caused severe security issues in recent years. Although many existing face forgery detection methods achieve superior performance on such synthetic videos, they are severely limited by their domain-specific training data and generally perform unsatisfactorily when transferred to cross-dataset scenarios due to domain gaps. Based on this observation, in this paper we propose a multi-level feature disentanglement network that is robust to the domain bias induced by the different types of fake artifacts in different datasets. Specifically, we first detect the face image and transform it into both color-aware and frequency-aware inputs for multi-modal contextual representation learning. Then, we introduce a novel feature disentangling module, built mainly on a pair of complementary attention maps, to disentangle the synthetic features into separate realistic features and features of fake artifacts. Since the features of fake artifacts are obtained indirectly from the latent features rather than from a dataset-specific distribution, our forgery detection model is robust to dataset-specific domain gaps. By applying the disentangling module at multiple levels of the feature extraction network with multi-modal inputs, we obtain more robust feature representations. In addition, a realistic-aware adversarial loss and a domain-aware adversarial loss are adopted to facilitate better feature disentanglement and extraction. Extensive experiments on four datasets verify the generalization of our method and demonstrate state-of-the-art performance.
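The two core ideas in the abstract — a frequency-aware view of the face and a disentangling step driven by a pair of complementary attention maps — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the FFT magnitude spectrum stands in for their frequency-aware transform, and a random logit map stands in for the learned attention predictor; shapes and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def frequency_input(face):
    """Hypothetical frequency-aware view: log-magnitude of a centered 2-D FFT."""
    spectrum = np.fft.fftshift(np.fft.fft2(face))
    return np.log1p(np.abs(spectrum))

def disentangle(features, attention_logits):
    """Split latent features with complementary attention maps a and (1 - a)."""
    a = 1.0 / (1.0 + np.exp(-attention_logits))  # sigmoid attention in (0, 1)
    realistic = a * features            # realistic-content branch
    artifact = (1.0 - a) * features     # fake-artifact branch (complementary map)
    return realistic, artifact

# Toy usage: an 8x8 grayscale "face" and stand-in latent features.
face = rng.random((8, 8))
freq = frequency_input(face)             # frequency-aware input, same shape as face
features = rng.random((8, 8))            # stand-in latent features
logits = rng.standard_normal((8, 8))     # stand-in attention-predictor output
real_part, fake_part = disentangle(features, logits)

# Because the maps are complementary, the two branches sum back to the input.
assert np.allclose(real_part + fake_part, features)
```

The complementary construction guarantees that no information is discarded at the split: the artifact branch receives exactly the part of each feature the realistic branch suppresses, which is what lets a detector trained on the artifact branch avoid binding to one dataset's specific forgery style.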