Adversarial system
Computer science
Black box
Generative grammar
Generative adversarial network
Artificial intelligence
Sample (material)
Machine learning
Generative model
Natural language processing
Deep learning
Chemistry
Chromatography
Authors
F. Frank Chen, Zhidong Shen
Source
Journal: Communications in Computer and Information Science
Date: 2023-11-27
Pages: 295-307
Identifier
DOI: 10.1007/978-981-99-8181-6_23
Abstract
BERT and other pre-trained language models are vulnerable to textual adversarial attacks. Current transfer-based textual adversarial attacks in black-box settings rely on real datasets to train substitute models, but obtaining those datasets can be difficult for attackers. To address this issue, we propose a data-free substitute training method (DaST-T) for textual adversarial attacks, which trains substitute models without any real data. DaST-T consists of two major steps. First, DaST-T builds a special Generative Adversarial Network (GAN) to train substitute models without real data: the training procedure uses samples synthesized at random by the generative model, with labels produced by querying the attacked model. In particular, DaST-T equips the generative model with a data augmenter that encourages rapid exploration of the entire sample space, thereby accelerating substitute-model training. Second, DaST-T applies existing white-box textual adversarial attack methods to the substitute model to generate adversarial text, which is then transferred to the attacked model. DaST-T thus removes the need for access to real datasets in black-box textual adversarial attacks. Experimental results on NLP text classification tasks show that DaST-T achieves superior attack performance compared to other black-box textual adversarial attack baselines while requiring fewer queries.
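The core idea of the first step, substitute training driven only by black-box queries, can be illustrated with a minimal sketch. This is not the authors' implementation: the "generator" is plain random sampling over a toy 10-dimensional embedding space, the attacked model is simulated by a hidden linear rule we may only query for labels, and the substitute is a logistic-regression model; all names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the attacked black-box model: we can only
# query its labels, never inspect its parameters (here, a hidden linear
# rule over 10-dimensional "text embeddings").
W_hidden = rng.normal(size=10)

def black_box(x):
    """Query-only access: returns hard labels for a batch of inputs."""
    return (x @ W_hidden > 0).astype(int)

def train_substitute(n_queries=2000, lr=0.5, epochs=300):
    """Data-free substitute training: synthesize random samples,
    label them by querying the black box, and fit a substitute."""
    X = rng.normal(size=(n_queries, 10))  # generator: no real data used
    y = black_box(X)                      # labels come from the attacked model
    w = np.zeros(10)
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))     # logistic substitute
        w -= lr * X.T @ (p - y) / n_queries    # gradient step on log-loss
    return w

w_sub = train_substitute()

# Measure how often the substitute agrees with the black box on fresh inputs;
# high agreement is what makes white-box attacks on the substitute transfer.
X_test = rng.normal(size=(500, 10))
agreement = np.mean((X_test @ w_sub > 0).astype(int) == black_box(X_test))
```

Once the substitute agrees closely with the attacked model, the second step of the paper applies standard white-box attack methods to the substitute and transfers the resulting adversarial text to the black box; that step is omitted here for brevity.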