Recurrent neural network
Computer science
Overhead (engineering)
Property (philosophy)
Ordinary differential equation
Term (time)
Stability (learning theory)
Computation
Artificial neural network
Dynamical systems theory
Artificial intelligence
Differential (mechanical device)
Dynamical system (definition)
Differential equation
Machine learning
Theoretical computer science
Algorithm
Mathematics
Operating system
Philosophy
Engineering
Aerospace engineering
Mathematical analysis
Physics
Quantum mechanics
Epistemology
Authors
Bo Chang, Minmin Chen, Eldad Haber, Ed H. Chi
Source
Journal: Cornell University - arXiv
Date: 2019-02-26
Citations: 86
Identifier
DOI: 10.48550/arXiv.1902.09689
Abstract
Recurrent neural networks have gained widespread use in modeling sequential data. Learning long-term dependencies using these models remains difficult though, due to exploding or vanishing gradients. In this paper, we draw connections between recurrent networks and ordinary differential equations. A special form of recurrent networks called the AntisymmetricRNN is proposed under this theoretical framework, which is able to capture long-term dependencies thanks to the stability property of its underlying differential equation. Existing approaches to improving RNN trainability often incur significant computation overhead. In comparison, AntisymmetricRNN achieves the same goal by design. We showcase the advantage of this new architecture through extensive simulations and experiments. AntisymmetricRNN exhibits much more predictable dynamics. It outperforms regular LSTM models on tasks requiring long-term memory and matches the performance on tasks where short-term dependencies dominate despite being much simpler.
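The abstract names the architecture but not its update rule. For orientation, below is a minimal NumPy sketch of an antisymmetric forward-Euler cell in the spirit the abstract describes: the recurrent weight matrix is constrained to be antisymmetric (W - Wᵀ), so the underlying ODE's Jacobian has purely imaginary eigenvalues and the hidden state neither explodes nor decays. The function name, the step size eps, the diffusion constant gamma, and the exact form of the update are illustrative assumptions, not taken from this page.

```python
import numpy as np

def antisymmetric_rnn_step(h, x, W, V, b, eps=0.01, gamma=0.01):
    """One forward-Euler step of a hypothetical antisymmetric RNN cell.

    (W - W.T) is antisymmetric by construction, so its eigenvalues are
    purely imaginary; the small gamma * I diffusion term keeps the
    explicit Euler discretization stable.
    """
    A = (W - W.T) - gamma * np.eye(W.shape[0])   # antisymmetric part plus diffusion
    return h + eps * np.tanh(A @ h + V @ x + b)  # explicit Euler update of the ODE

# Illustrative usage: unroll the cell over a random input sequence.
rng = np.random.default_rng(0)
d_h, d_x, T = 32, 8, 100
W = rng.normal(scale=0.1, size=(d_h, d_h))
V = rng.normal(scale=0.1, size=(d_h, d_x))
b = np.zeros(d_h)

h = np.zeros(d_h)
for t in range(T):
    h = antisymmetric_rnn_step(h, rng.normal(size=d_x), W, V, b)
print(h.shape)  # (32,)
```

Because the antisymmetric structure is imposed by construction rather than by an added penalty or projection, each step costs no more than a standard RNN update, which is consistent with the abstract's claim that the architecture avoids the computation overhead of other trainability fixes.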