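Computer science
Artificial intelligence
Encoder
Merge (version control)
Parsing
Face detection
Pattern recognition (psychology)
Spatial relation
Facial recognition system
Feature extraction
Computer vision
Transformer
Natural language processing
Feature (linguistics)
Information retrieval
Quantum mechanics
Linguistics
Philosophy
Physics
Voltage
Operating system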
Authors
Yuting Xu, Gengyun Jia, Huaibo Huang, Junxian Duan, Ran He
Identifier
DOI:10.1109/ijcb52358.2021.9484407
Abstract
This paper proposes a novel Visual-Semantic Transformer (VST) to detect face forgery based on semantic-aware feature relations. In face images, intrinsic feature relations exist between different semantic parsing regions. We find that face forgery algorithms always change such relations. Therefore, we start the approach by extracting a Contextual Feature Sequence (CFS) using a transformer encoder to best capture abnormal feature relation patterns. Meanwhile, images are segmented into soft face regions by a face parsing module. Then we merge the CFS and the soft face regions into Visual Semantic Sequences (VSS) representing the features of semantic regions. The VSS is fed into the transformer decoder, in which relations at the semantic region level are modeled. Our method achieves 99.58% accuracy on FF++ (Raw) and 96.16% accuracy on Celeb-DF. Extensive experiments demonstrate that our framework outperforms or is comparable to state-of-the-art detection methods, especially on unseen forgery methods.
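The merge step described in the abstract can be pictured with a small sketch. The abstract does not say how the CFS and the soft face regions are combined, so the version below assumes a mask-weighted average: each soft region mask pools the patch features into one token per region. The function name `merge_to_region_tokens`, the array shapes, and the toy values are all hypothetical and are not taken from the paper.

```python
import numpy as np


def merge_to_region_tokens(features, soft_masks, eps=1e-8):
    """Pool a patch feature sequence into one token per soft face region.

    Assumption (not from the paper): the merge is a mask-weighted average.

    features:   (N, D) array, one D-dim feature per patch (stand-in for the CFS).
    soft_masks: (K, N) array, non-negative soft membership of each patch
                in each of K face regions (stand-in for the parsing output).
    Returns a (K, D) array, one pooled token per region (stand-in for the VSS).
    """
    features = np.asarray(features, dtype=float)
    soft_masks = np.asarray(soft_masks, dtype=float)
    # Normalize each region's mask so its weights sum to (almost exactly) 1.
    weights = soft_masks / (soft_masks.sum(axis=1, keepdims=True) + eps)
    # Weighted sum of patch features for every region at once.
    return weights @ features


# Toy input: 4 patches with 2-dim features, 2 soft regions.
features = np.array([[1.0, 0.0],
                     [3.0, 0.0],
                     [0.0, 2.0],
                     [0.0, 4.0]])
soft_masks = np.array([[1.0, 1.0, 0.0, 0.0],   # region 0 covers patches 0 and 1
                       [0.0, 0.0, 1.0, 3.0]])  # region 1 favors patch 3
vss = merge_to_region_tokens(features, soft_masks)
```

In this reading, the resulting region tokens would be the sequence passed to the transformer decoder, so attention operates over a handful of semantic regions rather than over every patch.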