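Computer science
Noise (video)
Data stream
Data stream mining
Artificial intelligence
Frame (networking)
Data mining
Generative grammar
RGB color model
Pattern recognition (psychology)
Machine learning
Image (mathematics)
Telecommunications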
Authors
Bing Han, Xiaoguang Han, Hua Zhang, Jingzhi Li, Xiaochun Cao
Source
Journal: IEEE Transactions on Biometrics, Behavior, and Identity Science
[Institute of Electrical and Electronics Engineers]
Date: 2021-03-12
Volume (issue): 3 (3): 320-331
Citations: 52
Identifier
DOI: 10.1109/tbiom.2021.3065735
Abstract
Benefiting from the development of deep generative networks, modern fake news generation methods known as Deepfake have rapidly gone viral over the Internet, calling for efficient detection methods. Existing Deepfake detection methods mostly use binary classification networks trained on frame-level inputs and do not leverage the temporal information in videos. Moreover, the accuracy of these methods drops rapidly on low-quality data. In this work, we propose a two-stream network that detects Deepfake at the video level and is capable of handling low-quality data. The proposed architecture first divides the input video into segments and then feeds selected frames from each segment into two streams. The first stream takes RGB information as input and tries to learn the semantic inconsistency. The second stream, in parallel, leverages noise features extracted by spatial rich model (SRM) filters. Because our experiments found that traditional SRM filters with fixed weights contribute only insignificant improvement, we design novel learnable SRM filters, which can better fit the noise inconsistency in tampered regions. Finally, segmental fusion and stream fusion are conducted to combine the information from segments and streams. We evaluate our algorithm on FaceForensics++, the largest existing Deepfake dataset, and the experimental results show that it obtains state-of-the-art performance.
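The pipeline outlined in the abstract (segment the video, select frames, extract a noise residual, fuse across segments and streams) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration, since the abstract does not give these details: the function names `sample_segment_frames`, `noise_residual`, and `fuse_scores` are hypothetical, the middle frame of each segment is picked, the kernel is a single fixed 3x3 high-pass filter of the kind found in SRM filter banks (the paper's filters are learnable), and both fusion steps are plain averages.

```python
from statistics import mean

# Illustrative 3x3 second-order high-pass kernel, scaled by 1/4 below.
# Assumption: a fixed stand-in for the paper's learnable SRM filters.
HIGH_PASS_3X3 = [[-1, 2, -1],
                 [2, -4, 2],
                 [-1, 2, -1]]


def sample_segment_frames(num_frames, num_segments):
    """Split frame indices [0, num_frames) into segments; return one index per segment.

    Assumption: the middle frame of each segment is selected.
    """
    if num_segments < 1 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    picked = []
    for i in range(num_segments):
        start = i * num_frames // num_segments
        end = (i + 1) * num_frames // num_segments
        picked.append((start + end) // 2)
    return picked


def noise_residual(gray):
    """Apply the high-pass kernel to a 2D grayscale image (list of rows).

    No padding is used, so the output is 2 pixels smaller in each dimension.
    The kernel sums to zero, so smooth regions give a residual near 0.
    """
    height, width = len(gray), len(gray[0])
    out = []
    for y in range(height - 2):
        row = []
        for x in range(width - 2):
            acc = 0
            for dy in range(3):
                for dx in range(3):
                    acc += HIGH_PASS_3X3[dy][dx] * gray[y + dy][x + dx]
            row.append(acc / 4)
        out.append(row)
    return out


def fuse_scores(rgb_segment_scores, noise_segment_scores, rgb_weight=0.5):
    """Segmental fusion (mean over segments), then stream fusion (weighted mean).

    Assumption: averaging is used for both steps; the abstract does not say how.
    """
    rgb = mean(rgb_segment_scores)
    noise = mean(noise_segment_scores)
    return rgb_weight * rgb + (1 - rgb_weight) * noise


if __name__ == "__main__":
    print(sample_segment_frames(90, 3))
    print(noise_residual([[0, 0, 0], [0, 4, 0], [0, 0, 0]]))
    print(fuse_scores([0.9, 0.7, 0.8], [0.6, 0.4, 0.5]))
```

In a deep learning framework, one way to make such a filter learnable would be to use the kernel as the initial weights of a trainable convolution layer; whether the authors do this is not stated in the abstract.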