Hyperparameter
Overfitting
Computer science
Artificial intelligence
Machine learning
Regularization (mathematics)
Artificial neural network
Deep learning
Gradient descent
Set (abstract data type)
Deep neural network
Supervised learning
Programming language
Authors
Mengye Ren, Wenyuan Zeng, Bin Yang, Raquel Urtasun
Source
Journal: Cornell University - arXiv
Date: 2018-01-01
Citations: 621
Identifier
DOI: 10.48550/arxiv.1803.09050
Abstract
Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns. However, they can also easily overfit to training set biases and label noises. In addition to various regularizers, example reweighting algorithms are popular solutions to these problems, but they require careful tuning of additional hyperparameters, such as example mining schedules and regularization hyperparameters. In contrast to past reweighting methods, which typically consist of functions of the cost value of each example, in this work we propose a novel meta-learning algorithm that learns to assign weights to training examples based on their gradient directions. To determine the example weights, our method performs a meta gradient descent step on the current mini-batch example weights (which are initialized from zero) to minimize the loss on a clean unbiased validation set. Our proposed method can be easily implemented on any type of deep network, does not require any additional hyperparameter tuning, and achieves impressive performance on class imbalance and corrupted label problems where only a small amount of clean validation data is available.
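To make the meta gradient descent step concrete, below is a minimal sketch of the reweighting idea the abstract describes, written in PyTorch with a toy logistic-regression model. The model, data shapes, learning rate, and variable names are illustrative assumptions, not the authors' implementation: per-example weights are initialized from zero, a virtual parameter update is taken through the weighted training loss, and the gradient of the clean-validation loss with respect to those weights determines the final example weights.

```python
# Sketch of one learning-to-reweight step (assumed toy setup, not the paper's code).
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy linear model and synthetic data (hypothetical shapes).
w = torch.randn(5, 3, requires_grad=True)
b = torch.zeros(3, requires_grad=True)
train_x, train_y = torch.randn(8, 5), torch.randint(0, 3, (8,))
val_x, val_y = torch.randn(4, 5), torch.randint(0, 3, (4,))
lr = 0.1

# 1) Per-example weights eps, initialized from zero as in the abstract.
eps = torch.zeros(train_x.size(0), requires_grad=True)

# 2) Weighted training loss and a *virtual* SGD step; create_graph=True keeps
#    the graph so the validation loss remains differentiable w.r.t. eps.
losses = F.cross_entropy(train_x @ w + b, train_y, reduction="none")
gw, gb = torch.autograd.grad((eps * losses).sum(), [w, b], create_graph=True)
w_v, b_v = w - lr * gw, b - lr * gb

# 3) Loss on the clean, unbiased validation set at the virtual parameters.
val_loss = F.cross_entropy(val_x @ w_v + b_v, val_y)

# 4) Meta-gradient w.r.t. eps: examples whose gradient direction helps the
#    validation loss get positive weight; the rest are clipped to zero.
eps_grad, = torch.autograd.grad(val_loss, eps)
weights = torch.clamp(-eps_grad, min=0.0)
if weights.sum() > 0:
    weights = weights / weights.sum()

# 5) Actual parameter update using the learned (detached) example weights.
final_loss = (weights.detach() *
              F.cross_entropy(train_x @ w + b, train_y, reduction="none")).sum()
final_loss.backward()
with torch.no_grad():
    w -= lr * w.grad
    b -= lr * b.grad
```

Note that at eps = 0 the virtual step leaves the parameters unchanged, yet the meta-gradient of the validation loss with respect to eps is still nonzero; this is what lets the method score each training example by its gradient direction without any extra hyperparameter tuning, consistent with the abstract's claim.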