Language Models are Few-Shot Learners

Subject tags: Computer science · Task (project management) · Language model · Natural language processing · Sentence · Artificial intelligence · Word (group theory) · Simplicity (philosophy) · Linguistics · Epistemology · Philosophy · Economics · Management
Authors
T. B. Brown, Benjamin F. Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey C.S. Wu, Clemens Winter, Christopher Hesse
Source
Venue: Cornell University - arXiv · Cited by: 13716
Identifier
DOI:10.48550/arxiv.2005.14165
Abstract

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
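The mechanism the abstract describes, specifying a task purely through a text prompt that contains a few demonstrations, with no gradient updates to the model, can be illustrated with a short sketch. The snippet below is not the authors' code: it uses the Hugging Face transformers library with GPT-2 as a freely downloadable stand-in (GPT-3 itself is not openly available), and the prompt format, model choice, and decoding settings are illustrative assumptions.

```python
# Minimal sketch of few-shot prompting: the task (3-digit addition) is
# conveyed only through demonstrations embedded in the prompt; the model's
# weights are never updated. GPT-2 is used here as a stand-in for GPT-3;
# the model name and decoding settings are assumptions for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Few-shot demonstrations followed by the query, all as plain text.
prompt = (
    "Q: 123 + 456\nA: 579\n"
    "Q: 250 + 311\nA: 561\n"
    "Q: 714 + 128\nA: "
)

# Greedy decoding; only a few new tokens are needed for the answer.
output = generator(prompt, max_new_tokens=4, do_sample=False)[0]["generated_text"]

# Everything after the prompt is the model's in-context "answer".
print(output[len(prompt):].strip())
```

A small model such as GPT-2 will usually fail this prompt; the abstract's central claim is that reliable few-shot behavior of this kind only emerges as the model is scaled up toward GPT-3's 175 billion parameters.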