Keywords
Perplexity
Computer Science
Sequence
Residual
Artificial Intelligence
Deep Learning
Layer
Network Model
Language Model
Algorithm
Authors
Hongyan Hao,Yan Wang,Yudi Xia,Jian Zhao,Furao Shen
Source
Venue: Cornell University - arXiv
Date: 2020-01-01
Citations: 31
Identifier
DOI:10.48550/arxiv.2002.12530
Abstract
With the development of feed-forward models, recurrent networks are gradually being replaced as the default model for sequence modeling. Many powerful feed-forward models based on convolutional networks and attention mechanisms have been proposed and show strong potential for sequence modeling tasks. We ask whether there is an architecture that can not only serve as an approximate substitute for recurrent networks but also absorb the advantages of feed-forward models. We therefore propose an exploratory architecture, the Temporal Convolutional Attention-based Network (TCAN), which combines a temporal convolutional network with an attention mechanism. TCAN consists of two parts: Temporal Attention (TA), which captures relevant features inside the sequence, and Enhanced Residual (ER), which extracts important information from shallow layers and transfers it to deep layers. We improve the state-of-the-art bpc/perplexity results to 30.28 on word-level PTB, 1.092 on character-level PTB, and 9.20 on WikiText-2.
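The core idea behind the Temporal Attention (TA) component, as the abstract describes it, is capturing relevant features inside the sequence without leaking future information. Below is a minimal pure-Python sketch of that causal-attention idea, assuming plain dot-product scores over raw features; the actual TCAN layer uses learned query/key/value projections inside TCN blocks, which are omitted here.

```python
import math

def causal_attention(x):
    """Sketch of causal self-attention: time step t attends only to
    steps 0..t, so no future information enters its representation.
    x: list of T feature vectors (lists of floats), all the same length."""
    T = len(x)
    out = []
    for t in range(T):
        # Attention scores of step t against steps 0..t (dot products).
        scores = [sum(a * b for a, b in zip(x[t], x[s])) for s in range(t + 1)]
        # Numerically stable softmax over the visible (past + current) steps.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        # Weighted sum of the visible steps' feature vectors.
        dim = len(x[t])
        out.append([sum(w * x[s][d] for s, w in enumerate(weights))
                    for d in range(dim)])
    return out
```

Note that the first time step can only attend to itself, so its output equals its input; later steps mix in earlier steps with softmax weights. This reflects the causal masking a TCN-style model needs, not the paper's exact parameterization.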