Generality
Scaling
Computer science
Scratch
Transfer of learning
Power law
Scaling law
Entropy (arrow of time)
Artificial intelligence
Mathematics
Statistics
Physics
Geometry
Operating system
Psychotherapist
Quantum mechanics
Psychology
Authors
Danny Hernandez,Jared Kaplan,Tom Henighan,Sam McCandlish
Source
Journal: Cornell University - arXiv
Date: 2021-02-02
Citations: 27
Abstract
We study empirical scaling laws for transfer learning between distributions in an unsupervised, fine-tuning setting. When we train increasingly large neural networks from-scratch on a fixed-size dataset, they eventually become data-limited and stop improving in performance (cross-entropy loss). When we do the same for models pre-trained on a large language dataset, the slope in performance gains is merely reduced rather than going to zero. We calculate the effective data "transferred" from pre-training by determining how much data a transformer of the same size would have required to achieve the same loss when training from scratch. In other words, we focus on units of data while holding everything else fixed. We find that the effective data transferred is described well in the low data regime by a power-law of parameter count and fine-tuning dataset size. We believe the exponents in these power-laws correspond to measures of the generality of a model and proximity of distributions (in a directed rather than symmetric sense). We find that pre-training effectively multiplies the fine-tuning dataset size. Transfer, like overall performance, scales predictably in terms of parameters, data, and compute.
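As a minimal sketch of the functional form the abstract describes, the effective data transferred can be written as a power law in the fine-tuning dataset size and the parameter count. The symbols D_T (effective data transferred), D_F (fine-tuning dataset size), N (parameter count), and the fitted constants k, α, β are standard notation assumed here; their fitted values are not quoted in this listing.

\[
  D_T \;=\; k \,(D_F)^{\alpha}\,(N)^{\beta},
  \qquad
  D_{\mathrm{effective}} \;=\; D_F + D_T .
\]

Under this form, pre-training acts as a multiplier on the fine-tuning data of roughly D_effective / D_F = 1 + D_T / D_F, consistent with the statement that pre-training effectively multiplies the fine-tuning dataset size; the exponents are the quantities the authors interpret as measures of model generality and of the (directed) proximity of the two distributions.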