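Speech recognition
Computer science
Mel-frequency cepstrum
Feature extraction
Artificial intelligence
Hidden Markov model
Feature (linguistics)
Artificial neural network
Pattern recognition (psychology)
Weighting
Preprocessor
Noise (video)
Medicine
Linguistics
Philosophy
Image (mathematics)
Radiology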
Authors
Jie Liang,Xiaoyan Zhao,Zhaohui Zhang
Identifier
DOI:10.1109/ccdc49329.2020.9164823
Abstract
With the development of information technology, speech emotion recognition has been applied to classroom evaluation, where analyzing and quantifying evaluation indexes in real time helps improve teaching quality. This paper studied teachers' speech signals and designed an audio processing system for emotion detection, in which a teacher's speech was used to judge the teacher's emotions. A recurrent neural network (RNN) was used to construct the speech emotion classification model. The original data were preprocessed with pre-emphasis, framing and windowing, and endpoint detection, and the emotions were then reclassified to establish the speech emotion corpus of the teacher evaluation system. The traditional feature extraction process for Mel Frequency Cepstral Coefficients (MFCC) was improved by adding a second-order difference step to eliminate the convolutional noise of the MFCC. In particular, a 1-dimensional energy feature was added to the 39-dimensional MFCC coefficients, and the results showed that the average recognition rate of the 40-dimensional feature parameters was 2.53% higher than that of the 39-dimensional parameters. Experiments on the unit structure of the classification network yielded an optimized Long Short-Term Memory (LSTM) model, with which the average recognition rate over five speech emotion classes reached 85.32%. The experiments showed that the improved MFCC features combined with the neural network raised the speech emotion recognition rate more effectively than traditional speech emotion recognition methods, and that the approach can be used for speech emotion recognition in classroom teaching.
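The preprocessing and feature-stacking steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything numeric here is an assumption: the pre-emphasis coefficient 0.97, the Hamming window, 25 ms frames with a 10 ms hop at 16 kHz, and the conventional reading of "39-dimensional MFCC" as 13 static coefficients plus their first- and second-order differences. The function names are hypothetical, the difference is a simple two-point central difference, and the computation of the static MFCCs themselves (FFT, mel filterbank, DCT) is left out, as is endpoint detection.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """High-pass filter y[n] = x[n] - alpha * x[n-1]; alpha is an assumed value."""
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames, dropping any incomplete tail."""
    if len(signal) < frame_len:
        return []
    n_frames = 1 + (len(signal) - frame_len) // hop
    return [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]


def hamming(frame_len):
    """Hamming window coefficients (assumed window type)."""
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]


def log_energy(frame, eps=1e-10):
    """Log of the summed squared samples: the 1-dimensional energy feature."""
    return math.log(sum(s * s for s in frame) + eps)


def delta(features):
    """Central difference over time, d[t] = (c[t+1] - c[t-1]) / 2, edges clamped."""
    last = len(features) - 1
    out = []
    for t in range(len(features)):
        prev_row = features[max(t - 1, 0)]
        next_row = features[min(t + 1, last)]
        out.append([(b - a) / 2 for a, b in zip(prev_row, next_row)])
    return out


def stack_features(static, energies):
    """Concatenate static, first-order, and second-order differences plus energy.

    With 13 static coefficients per frame (an assumption) this gives
    13 + 13 + 13 + 1 = 40 values per frame.
    """
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b + [e] for s, a, b, e in zip(static, d1, d2, energies)]


if __name__ == "__main__":
    sr = 16000  # assumed sampling rate
    # Synthetic 1-second 440 Hz tone, used only to exercise the functions.
    tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
    window = hamming(400)
    frames = [
        [s * w for s, w in zip(frame, window)]
        for frame in frame_signal(pre_emphasis(tone), frame_len=400, hop=160)
    ]
    energies = [log_energy(frame) for frame in frames]
    print(len(frames), "frames;", "first log-energy:", energies[0])
```

In a full pipeline, the per-frame static MFCCs would be passed to `stack_features` together with `energies`, and the resulting 40-dimensional frame sequence would be the input to the recurrent classifier.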