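Pronunciation
Computer science
Regularization (linguistics)
Contrastive analysis
Artificial intelligence
Natural language processing
Speech recognition
Linguistics
Philosophy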
Authors
Bicheng Yan, Wei-Cheng Chao, Jiun-Ting Li, Yicheng Wang, Hsin-Wei Wang, Meng-Shin Lin, Berlin Chen
Source
Journal: Cornell University - arXiv
Date: 2024-06-04
Citations: 1
Identifier
DOI:10.48550/arxiv.2406.02859
Abstract
Automatic pronunciation assessment (APA) aims to evaluate the pronunciation proficiency of a second language (L2) learner in a target language. Existing efforts typically rely on regression models for proficiency score prediction, where the models are trained to estimate target values without explicitly accounting for phoneme awareness in the feature space. In this paper, we propose a contrastive phonemic ordinal regularizer (ConPCO) tailored for regression-based APA models to generate more phoneme-discriminative features while considering the ordinal relationships among the regression targets. The proposed ConPCO first aligns the phoneme representations of an APA model with the textual embeddings of phonetic transcriptions via contrastive learning. Afterward, the phoneme characteristics are retained by regulating the inter- and intra-phoneme-category distances in the feature space while accounting for the ordinal relationships among the output targets. We further design and develop a hierarchical APA model to evaluate the effectiveness of our method. Extensive experiments conducted on the speechocean762 benchmark dataset suggest the feasibility and efficacy of our approach compared with several cutting-edge baselines.
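The contrastive alignment step mentioned in the abstract can be illustrated with a generic symmetric InfoNCE-style loss. The sketch below is not the ConPCO formulation, which the abstract does not spell out; the function name `contrastive_alignment_loss`, the `temperature` value, and the assumption that row i of the acoustic features is paired with row i of the text embeddings are all hypothetical choices made for illustration.

```python
import numpy as np


def log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def contrastive_alignment_loss(phone_feats, text_embs, temperature=0.07):
    """Symmetric InfoNCE-style loss between two sets of paired vectors.

    phone_feats: (N, D) phoneme-level features from an acoustic model (assumed).
    text_embs:   (N, D) embeddings of the matching phoneme labels (assumed).
    Row i of each array is treated as a positive pair; all other rows
    in the batch act as negatives.
    """
    # L2-normalize so the dot product is a cosine similarity.
    a = phone_feats / np.linalg.norm(phone_feats, axis=1, keepdims=True)
    t = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)

    # (N, N) similarity matrix, sharpened by the temperature.
    logits = a @ t.T / temperature
    idx = np.arange(len(a))

    # Cross-entropy in both directions, with the diagonal as the target.
    loss_a2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2a = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_a2t + loss_t2a)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16))
    # Correctly paired inputs should score a lower loss than shifted ones.
    print(contrastive_alignment_loss(feats, feats))
    print(contrastive_alignment_loss(feats, np.roll(feats, 1, axis=0)))
```

In a sketch like this, minimizing the loss pulls each phoneme feature toward the embedding of its own label and pushes it away from the other labels in the batch. The ordinal distance regularization described in the abstract would be a separate term and is not covered here.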