Keywords
Computer science
Stochastic gradient descent
Gradient descent
Convergence (economics)
Coding (set theory)
Popularity
Artificial intelligence
Artificial neural network
Gradient method
Rate of convergence
Machine learning
Mathematical optimization
Algorithm
Key (lock)
Mathematics
Social psychology
Economic growth
Economics
Set (abstract data type)
Programming language
Computer security
Psychology
Authors
Ilya Loshchilov, Frank Hutter
Source
Journal: Cornell University - arXiv
Date: 2016-01-01
Citations: 3329
Identifier
DOI: 10.48550/arxiv.1608.03983
Abstract
Restart techniques are common in gradient-free optimization to deal with multimodal functions. Partial warm restarts are also gaining popularity in gradient-based optimization to improve the rate of convergence in accelerated gradient schemes to deal with ill-conditioned functions. In this paper, we propose a simple warm restart technique for stochastic gradient descent to improve its anytime performance when training deep neural networks. We empirically study its performance on the CIFAR-10 and CIFAR-100 datasets, where we demonstrate new state-of-the-art results at 3.14% and 16.21%, respectively. We also demonstrate its advantages on a dataset of EEG recordings and on a downsampled version of the ImageNet dataset. Our source code is available at https://github.com/loshchil/SGDR
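The warm restart technique proposed in the paper anneals the learning rate with a cosine schedule within each cycle and resets it to its maximum at the start of the next, with cycle lengths that may grow after each restart. Below is a minimal plain-Python sketch of that schedule; the values chosen for eta_min, eta_max, T_0 (length of the first cycle) and T_mult (cycle growth factor) are illustrative placeholders, not the settings used in the paper's experiments.

import math

def sgdr_learning_rate(epoch, eta_min=0.0, eta_max=0.1, T_0=10, T_mult=2):
    """Cosine annealing with warm restarts.

    eta_min, eta_max : learning-rate range (illustrative values)
    T_0              : length of the first cycle, in epochs
    T_mult           : factor by which each cycle grows after a restart
    """
    # Locate the cycle containing `epoch` and the position within that cycle.
    T_i, cycle_start = T_0, 0
    while epoch >= cycle_start + T_i:
        cycle_start += T_i
        T_i *= T_mult
    T_cur = epoch - cycle_start
    # eta_t = eta_min + 1/2 * (eta_max - eta_min) * (1 + cos(pi * T_cur / T_i))
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * T_cur / T_i))

if __name__ == "__main__":
    # With T_0=10 and T_mult=2, the rate decays over each cycle and jumps back
    # to eta_max at epochs 10 and 30 (the warm restarts).
    for epoch in range(40):
        print(epoch, round(sgdr_learning_rate(epoch), 4))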