Poison Attack and Poison Detection on Deep Source Code Processing Models

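Keywords: Computer science, Source code, Computer security, Deep learning, Coding (set theory), Interpretability, Vulnerability (computing), Codebase, Artificial intelligence, Programming language, Set (abstract data type)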
Authors
Jia Li, Zhuo Li, Huangzhao Zhang, Ge Li, Zhi Jin, Xing Hu, Xin Xia
Source
Journal: ACM Transactions on Software Engineering and Methodology [Association for Computing Machinery]
Volume (Issue): 33 (3): 1-31 Citations: 7
Identifier
DOI:10.1145/3630008
Abstract

In the software engineering (SE) community, deep learning (DL) has recently been applied to many source code processing tasks, achieving state-of-the-art results. Due to the poor interpretability of DL models, their security vulnerabilities require scrutiny. Recently, researchers have identified an emergent security threat to DL models, namely, poison attacks. The attackers aim to inject insidious backdoors into DL models by poisoning the training data with poison samples. A backdoored (poisoned) model works normally on clean inputs but produces targeted erroneous results on inputs embedded with specific triggers. By using triggers to activate backdoors, attackers can manipulate poisoned models in security-related scenarios (e.g., defect detection) and cause severe consequences. To verify the vulnerability of deep source code processing models to poison attacks, we present a poison attack approach for source code named CodePoisoner as a strong imaginary enemy. CodePoisoner can produce compilable and functionality-preserving poison samples and effectively attack deep source code processing models by poisoning the training data with poison samples. To defend against poison attacks, we further propose an effective poison detection approach named CodeDetector. CodeDetector can automatically identify poison samples in the training data. We apply CodePoisoner and CodeDetector to six deep source code processing models, including defect detection, clone detection, and code repair models. The results show that ❶ CodePoisoner conducts successful poison attacks with a high attack success rate (average: 98.3%, maximum: 100%), which validates that existing deep source code processing models are highly vulnerable to poison attacks; and ❷ CodeDetector effectively defends against multiple poison attack approaches by detecting poison samples in the training data (maximum: 100%). We hope this work can help SE researchers and practitioners notice poison attacks and inspire the design of more advanced defense techniques.
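
To make the terms in the abstract concrete, the following is a minimal toy sketch of what "poisoning the training data" and "attack success rate" can mean for a defect detection dataset. It is not the CodePoisoner algorithm, whose details are not given in this abstract. The trigger statement, the `Sample` type, and the function names are all hypothetical, chosen only for illustration; the trigger is an unused variable declaration, which leaves program behavior unchanged.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical trigger: an unused declaration that does not change behavior.
TRIGGER = "int __unused_flag = 0;"


@dataclass
class Sample:
    code: str
    label: int  # assumed convention: 1 = defective, 0 = non-defective


def embed_trigger(code: str) -> str:
    """Insert the trigger right after the first '{'; return code unchanged if none."""
    head, brace, tail = code.partition("{")
    if not brace:
        return code
    return f"{head}{{ {TRIGGER}{tail}"


def poison_dataset(data: List[Sample], rate: float, target_label: int,
                   seed: int = 0) -> List[Sample]:
    """Return a copy of data where a `rate` fraction of samples carries the
    trigger and is relabeled to `target_label`."""
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(data)), int(len(data) * rate)))
    return [
        Sample(embed_trigger(s.code), target_label) if i in chosen else s
        for i, s in enumerate(data)
    ]


def attack_success_rate(predict: Callable[[str], int],
                        triggered_inputs: List[str],
                        target_label: int) -> float:
    """Fraction of triggered inputs that the model maps to the target label."""
    if not triggered_inputs:
        return 0.0
    hits = sum(1 for code in triggered_inputs if predict(code) == target_label)
    return hits / len(triggered_inputs)
```

In this sketch, `predict` stands for any trained classifier wrapped as a function from source text to a label. A model trained on the output of `poison_dataset` would be considered backdoored if it scores well on clean test data while `attack_success_rate` on triggered inputs is high.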
