Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning

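Distillation, Initialization, Ensemble learning, Computer science, Artificial neural network, Artificial intelligence, Machine learning, Boosting (machine learning), Ensemble forecasting, Test set, Deep learning, Test data, Chemistry, Organic chemistry, Programming language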
Authors
Zeyuan Allen-Zhu,Yuanzhi Li
Source
Journal: Cornell University - arXiv, Citations: 36
Identifier
DOI:10.48550/arxiv.2012.09816
Abstract

We formally study how an ensemble of deep learning models can improve test accuracy, and how the superior performance of the ensemble can be distilled into a single model using knowledge distillation. We consider the challenging case where the ensemble is simply an average of the outputs of a few independently trained neural networks with the SAME architecture, trained using the SAME algorithm on the SAME data set, differing only in the random seeds used for initialization. We show that ensemble/knowledge distillation in deep learning works very differently from traditional learning theory (such as boosting or NTKs, neural tangent kernels). To properly understand them, we develop a theory showing that when data has a structure we refer to as "multi-view", an ensemble of independently trained neural networks can provably improve test accuracy, and this superior test accuracy can also be provably distilled into a single model by training it to match the output of the ensemble instead of the true label. Our result sheds light on how ensembles work in deep learning in a way that is completely different from traditional theorems, and how the "dark knowledge" is hidden in the outputs of the ensemble and can be used in distillation. Finally, we prove that self-distillation can also be viewed as implicitly combining ensemble and knowledge distillation to improve test accuracy.
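
The two operations the abstract describes, averaging the outputs of independently trained networks and training a single model to match that average instead of the true label, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the hand-written logits, and the use of a softmax cross-entropy as the matching loss are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_logits(member_logits):
    """Average the outputs of several models for one input.

    member_logits holds one list of class scores per ensemble member.
    """
    n = len(member_logits)
    return [sum(col) / n for col in zip(*member_logits)]


def distillation_loss(student_logits, teacher_logits):
    """Cross-entropy between the teacher's soft labels and the student's prediction.

    Assumption: the student is trained against the ensemble's softmax output
    rather than a one-hot true label.
    """
    teacher_probs = softmax(teacher_logits)
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


# Hypothetical scores from three networks that differ only in their random seed.
members = [[2.0, 1.0, 0.0], [0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
teacher = ensemble_logits(members)

# A student that reproduces the ensemble output attains the smallest loss;
# a student that copies only the first member does worse.
loss_matched = distillation_loss(teacher, teacher)
loss_single = distillation_loss(members[0], teacher)
```

In a real training loop the student's logits would come from a network and the loss would be minimized by gradient descent over the training set. For self-distillation, under the same assumptions, the teacher would be a single previously trained model of the same architecture rather than an average over several.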