
Safety at Scale: A Comprehensive Survey of Large Model Safety

Scale (ratio) · Geography · Cartography
Authors
Yu-Gang Jiang, Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, Hengyuan Xu, Yunhao Chen, Yunhan Zhao, Hanxun Huang, Yijuan Li, Jiaming Zhang, Xiang Zheng, Yang Bai, Hang Xu, Xipeng Qiu, Jingfeng Zhang, Yiming Yi
Identifier
DOI:10.70777/si.v2i2.14609
Abstract

Abstract—The rapid advancement of large models, driven by their exceptional abilities in learning and generalization through large-scale pre-training, has reshaped the landscape of Artificial Intelligence (AI). These models are now foundational to a wide range of applications, including conversational AI, recommendation systems, autonomous driving, content generation, medical diagnostics, and scientific discovery. However, their widespread deployment also exposes them to significant safety risks, raising concerns about robustness, reliability, and ethical implications. This survey provides a systematic review of current safety research on large models, covering Vision Foundation Models (VFMs), Large Language Models (LLMs), Vision-Language Pre-training (VLP) models, Vision-Language Models (VLMs), Diffusion Models (DMs), and large-model-based Agents. Our contributions are summarized as follows: (1) We present a comprehensive taxonomy of safety threats to these models, including adversarial attacks, data poisoning, backdoor attacks, jailbreak and prompt injection attacks, energy-latency attacks, data and model extraction attacks, and emerging agent-specific threats. (2) We review the defense strategies proposed for each type of attack, where available, and summarize the commonly used datasets and benchmarks for safety research. (3) Building on this, we identify and discuss the open challenges in large model safety, emphasizing the need for comprehensive safety evaluations, scalable and effective defense mechanisms, and sustainable data practices. More importantly, we highlight the necessity of collective efforts from the research community and international collaboration. Our work can serve as a useful reference for researchers and practitioners, fostering the ongoing development of comprehensive defense systems and platforms to safeguard AI models. GitHub: https://github.com/xingjunm/Awesome-Large-Model-Safety.

Index Terms—Large Model Safety, AI Safety, Attacks and Defenses
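To make the first threat category in the abstract's taxonomy concrete, the sketch below illustrates a classic adversarial attack, the Fast Gradient Sign Method (FGSM), against an image classifier. This is an illustrative example, not code from the survey: it assumes a differentiable PyTorch model, inputs scaled to [0, 1], and a placeholder perturbation budget epsilon.

# Minimal FGSM sketch (illustrative only; model, inputs, and epsilon are placeholders).
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Return adversarial examples crafted with a single signed-gradient step."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)  # the loss the attacker wants to increase
    loss.backward()
    # Step in the sign of the input gradient, then clamp back to the valid pixel range.
    perturbed = x_adv.detach() + epsilon * x_adv.grad.sign()
    return perturbed.clamp(0.0, 1.0)

Here epsilon bounds the per-pixel perturbation under the L-infinity norm; stronger iterative variants (e.g., PGD) follow the same idea with multiple smaller steps.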
