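Dysarthria
Mandarin Chinese
Computer science
Speech recognition
Feature selection
Pronunciation
Mean squared error
Feature (linguistics)
Artificial intelligence
Regression
Regression analysis
Pattern recognition (psychology)
Machine learning
Statistics
Mathematics
Psychology
Linguistics
Philosophy
Psychiatry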
Authors
Shangjun Lu, Xiaoxia Du, Juan Liu, Yumei Zhang, Shaofeng Zhao, Rongfeng Su, Lan Wang, Nan Yan
Identifier
DOI:10.1109/ialp57159.2022.9961300
Abstract
Automatic objective assessment of dysarthria is clinically valuable. Most previous studies use audio-only data and ignore the complementary information in other modalities. In addition, traditional methods ignore the relationship between the pre-defined features and different pronunciations, which reduces the performance of the automatic assessment system. To address these issues, this paper proposes a joint feature-sample selection (JFSS) based regression model for dysarthria severity level using audio-visual data. In the proposed framework, the JFSS method simultaneously selects relevant pronunciation samples and features and discards unreliable noisy samples. On the Mandarin Subacute Stroke Dysarthria Multimodal (MSDM) Database, the proposed regression model outperformed several baseline models. Using acoustic-visual features, it achieved a root mean square error (RMSE) of 13.78 and a fitting coefficient (R-square) of 0.77 between the automatically predicted scores and the perceptual evaluation scores (i.e., the Frenchay Dysarthria Assessment), confirming the capacity of the proposed JFSS-based regression method to predict dysarthria severity level.
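The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions of RMSE and of R-square as the coefficient of determination; the abstract does not state which R-square variant the authors used. The function names and the score values below are hypothetical and are not taken from the paper or the MSDM Database.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted scores."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_square(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical perceptual scores and model predictions, for illustration only.
perceptual = [40.0, 60.0, 80.0, 100.0]
predicted = [50.0, 55.0, 85.0, 90.0]

print(f"RMSE = {rmse(perceptual, predicted):.2f}")
print(f"R-square = {r_square(perceptual, predicted):.3f}")
```

A lower RMSE and an R-square closer to 1 both indicate that the predicted severity scores track the perceptual ratings more closely.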