话语
计算机科学
语音识别
循环神经网络
瓶颈
人工神经网络
人工智能
语言模型
嵌入式系统
作者
Zhanyu Ma,Hong Yu,Wei Chen,Jun Guo
出处
期刊:IEEE Transactions on Vehicular Technology
[Institute of Electrical and Electronics Engineers]
日期:2018-11-06
卷期号:68 (1): 121-128
被引量:89
标识
DOI:10.1109/tvt.2018.2879361
摘要
Conversations in the intelligent vehicles are usually short utterance. As the durations of the short utterances are small (e.g., less than 3 s), it is difficult to learn sufficient information to distinguish the type of languages. In this paper, we propose an end-to-end short utterances based speech language identification (SLI) approach, which is especially suitable for the short utterance based language identification. This approach is implemented with a long short-term memory (LSTM) neural network, which is designed for the SLI application in intelligent vehicles. The features used for LSTM learning are generated by a transfer learning method. The bottleneck features of a deep neural network, which are obtained for a mandarin acoustic-phonetic classifier, are used for the LSTM training. In order to improve the SLD accuracy with short utterances, a phase vocoder based time-scale modification method is utilized to reduce/increase the speech rate of the test utterance. By connecting the normal, speech rate reduced, and speech rate increased utterances, we can extend the length of the test utterances such that the performance of the SLI system is improved. The experimental results on the AP17-OLR database demonstrate that the proposed method can improve the performance of SLD, especially on short utterance. The proposed SLI has robust performance under the vehicular noisy environment.
科研通智能强力驱动
Strongly Powered by AbleSci AI