Authors
Rupesh K. Srivastava, Klaus Greff, Jürgen Schmidhuber
Source
Journal: Cornell University - arXiv
Date: 2015-05-03
Citations: 317
Abstract
There is plenty of theoretical and empirical evidence that depth of neural networks is a crucial ingredient for their success. However, network training becomes more difficult with increasing depth and training of very deep networks remains an open problem. In this extended abstract, we introduce a new architecture designed to ease gradient-based training of very deep networks. We refer to networks with this architecture as highway networks, since they allow unimpeded information flow across several layers on information highways. The architecture is characterized by the use of gating units which learn to regulate the flow of information through a network. Highway networks with hundreds of layers can be trained directly using stochastic gradient descent and with a variety of activation functions, opening up the possibility of studying extremely deep and efficient architectures.
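The gating mechanism described in the abstract can be sketched as a single highway layer computing y = H(x)·T(x) + x·(1 − T(x)), where T is a sigmoid "transform gate". This is a minimal NumPy sketch, not the authors' implementation; the tanh choice for H and the strongly negative gate-bias initialization (which makes the layer start near the identity, easing gradient flow through deep stacks) are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_H, b_H, W_T, b_T):
    """One highway layer: y = H(x) * T(x) + x * (1 - T(x)).

    H is a nonlinear transform (tanh here, an illustrative choice);
    T is the sigmoid transform gate. When T -> 0 the layer passes x
    through unchanged -- the "information highway" carry path.
    """
    H = np.tanh(x @ W_H + b_H)    # candidate transform of the input
    T = sigmoid(x @ W_T + b_T)    # transform gate, values in (0, 1)
    return H * T + x * (1.0 - T)  # gated mix of transform and carry

# With a strongly negative gate bias, T is near 0 and the layer is
# close to the identity map, so even very deep stacks of such layers
# start out easy to train with plain stochastic gradient descent.
rng = np.random.default_rng(0)
d = 4
x = rng.standard_normal((2, d))
W_H = 0.1 * rng.standard_normal((d, d))
W_T = 0.1 * rng.standard_normal((d, d))
y = highway_layer(x, W_H, np.zeros(d), W_T, np.full(d, -10.0))
print(np.allclose(y, x, atol=1e-3))  # True: near-identity behavior
```

A full network would stack many such layers (with learned biases rather than a fixed −10) and train all weights jointly with SGD, as the abstract describes.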