Authors
Jiameng Pu, Zain Sarwar, Sifat Muhammad Abdullah, Abdullah Rehman, Yoonjin Kim, Parantapa Bhattacharya, Mobin Javed, Bimal Viswanath
Identifier
DOI: 10.1109/sp46215.2023.10179387
Abstract
Recent advances in generative models for language have enabled the creation of convincing synthetic text or deepfake text. Prior work has demonstrated the potential for misuse of deepfake text to mislead content consumers. Therefore, deepfake text detection, the task of discriminating between human and machine-generated text, is becoming increasingly critical. Several defenses have been proposed for deepfake text detection. However, we lack a thorough understanding of their real-world applicability. In this paper, we collect deepfake text from 4 online services powered by Transformer-based tools to evaluate the generalization ability of the defenses on content in the wild. We develop several low-cost adversarial attacks, and investigate the robustness of existing defenses against an adaptive attacker. We find that many defenses show significant degradation in performance under our evaluation scenarios compared to their original claimed performance. Our evaluation shows that tapping into the semantic information in the text content is a promising approach for improving the robustness and generalization performance of deepfake text detection schemes.
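The abstract frames deepfake text detection as a binary discrimination task between human-written and machine-generated text, commonly approached with a Transformer encoder plus a classification head. The sketch below is an illustration of that general setup only, not the defenses evaluated in the paper; the `roberta-base` backbone, the label mapping, and the `classify` helper are assumptions introduced here for clarity.

```python
# Minimal sketch of a Transformer-based human-vs-machine text classifier.
# Assumed backbone and label convention; not the paper's evaluated defenses.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-base"  # assumed encoder; any Transformer backbone could be used

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

LABELS = {0: "human-written", 1: "machine-generated"}  # assumed label convention


def classify(text: str) -> tuple[str, float]:
    """Return the predicted label and its probability for a piece of text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    idx = int(torch.argmax(probs))
    return LABELS[idx], float(probs[idx])


if __name__ == "__main__":
    label, confidence = classify("The quick brown fox jumps over the lazy dog.")
    print(f"{label} (p={confidence:.2f})")
    # Note: without fine-tuning on labeled human/machine text, the prediction is
    # meaningless; this sketch only shows the inference interface of such a detector.
```

In practice a detector like this would be fine-tuned on a labeled corpus of human and machine text, which is exactly where the abstract's concerns about generalization to in-the-wild content and robustness to adaptive attackers arise.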