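Pronunciation
Computer science
Generalizability theory
Natural language processing
Artificial intelligence
Set (abstract data type)
Feature (linguistics)
Language model
Speech recognition
Linguistics
Mathematics
Philosophy
Statistics
Programming language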
Authors
Binghuai Lin, Liyuan Wang
Identifier
DOI:10.1109/icassp49357.2023.10095673
Abstract
Automatic pronunciation assessment is commonly trained and applied for a specific language, which is not practical in multi-lingual or low-resource scenarios. In this paper, we propose a unified method to take advantage of multi-lingual data for multi-lingual pronunciation assessment. To this end, we first construct a concise unified phoneme set for multi-lingual phoneme recognition based on a pre-trained acoustic model. In this way we can not only share language-independent knowledge but also try to discriminate language-specific information for pronunciation assessment. Second, we employ language-specific embeddings for different languages, which act like language-specific assessment criteria to adaptively adjust the feature weights based on an attention mechanism. The whole network is optimized in a unified framework. Experimental results based on multi-lingual datasets demonstrate its superiority to different baselines in Pearson correlation coefficient (PCC). We also illustrate the generalizability of the proposed method for both seen and unseen data.
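The abstract names two ideas that a small example can make concrete: a language-specific embedding that reweights features through attention, and evaluation by Pearson correlation coefficient (PCC). The sketch below is only an illustration under stated assumptions. The abstract does not give the network's actual attention form, so scaled dot-product attention is assumed here, and the names `attention_pool`, `softmax`, and `pearson` are hypothetical helpers, not part of the authors' code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, language_embedding):
    """Pool feature vectors using a language embedding as the attention query.

    Assumption: scaled dot-product scoring; the paper may use another form.
    Returns the weighted sum of the feature vectors and the attention weights.
    """
    scale = math.sqrt(len(language_embedding))
    scores = [
        sum(f * q for f, q in zip(vec, language_embedding)) / scale
        for vec in features
    ]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, features))
        for d in range(dim)
    ]
    return pooled, weights


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Toy usage: two feature vectors, one query standing in for a language embedding.
toy_features = [[1.0, 0.0], [0.0, 1.0]]
toy_embedding = [2.0, 0.0]  # made-up values, for illustration only
pooled, weights = attention_pool(toy_features, toy_embedding)
```

With this toy query, the first feature vector receives the larger weight because it aligns with the embedding; swapping in a different embedding changes the weighting without changing the features, which is the sense in which the embedding acts like a per-language criterion. `pearson` would be applied to predicted scores and human ratings.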