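Computer science
Pronunciation
Hidden Markov model
Speech recognition
Word error rate
Artificial intelligence
Discriminative model
Artificial neural network
Mixture model
Connectionism
Pattern recognition (psychology)
Linguistics
Philosophy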
Authors
Mohamed S. Elaraby,Mustafa Abdallah,Sherif Abdou,Mohsen Rashwan
Identifier
DOI:10.1007/978-3-319-43958-7_5
Abstract
Gaussian Mixture Models (GMMs) have been the most commonly used models in pronunciation verification systems. The recently introduced Deep Neural Networks (DNNs) have been shown to provide significantly better discriminative models of the acoustic space. In this paper, we describe our efforts to upgrade the models of a Computer Aided Pronunciation Learning (CAPL) system that is used to teach Arabic pronunciation for Quran recitation rules. Four major enhancements were introduced. First, we used speaker adaptive training (SAT) to reduce inter-speaker variability. Second, we integrated hybrid DNN-HMM models to enhance the acoustic model and decrease the phone error rate. Third, we integrated Minimum Phone Error (MPE) training with the hybrid DNN. Finally, in the testing phase, we used a grammar-based decoding graph to limit the search space to the frequent error types. A comparison between the conventional GMM-HMM and the hybrid DNN-HMM showed significant performance improvements.