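Computer science
Discriminant
Artificial intelligence
Pattern recognition (psychology)
Facial expression
Convolutional neural network
Fuzzy logic
Benchmark (surveying)
Convolution (computer science)
Speech recognition
Artificial neural network
Geodesy
Geography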
Authors
Min Wu, Wanjuan Su, Luefeng Chen, Witold Pedrycz, Kaoru Hirota
Identifier
DOI:10.1109/taffc.2020.2966440
Abstract
A two-stage fuzzy fusion based convolution neural network is proposed for dynamic emotion recognition using both facial expression and speech modalities. It extracts discriminative emotion features that contain spatio-temporal information and effectively fuses the facial expression and speech modalities. Moreover, the proposal can handle situations where the contributions of the individual modalities to emotion recognition are highly imbalanced. Local binary patterns from three orthogonal planes and the spectrogram are first used to extract low-level dynamic emotion features, so that the spatio-temporal information of these modalities can be obtained. To reveal more discriminative features, two deep convolution neural networks are constructed to extract high-level emotion semantic features. A two-stage fuzzy fusion strategy is then developed by integrating canonical correlation analysis and the fuzzy broad learning system, so as to take into account the correlation and difference between the features of different modalities and to handle the ambiguity of emotional state information. Experimental results on the SAVEE, eNTERFACE'05, and AFEW benchmark databases show that the proposed method achieves higher accuracy than existing methods, such as the hybrid deep model and the rule-based and machine learning method.
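The abstract names canonical correlation analysis (CCA) as one ingredient of the fusion strategy but gives no implementation details. As a rough illustration only, the following sketch shows how CCA could project two feature sets into a shared, maximally correlated space and concatenate the projections. The function name `cca_fuse`, the parameters `n_components` and `reg`, and the choice to concatenate the projections are assumptions made for this example, not the authors' method; the fuzzy broad learning system is not sketched here.

```python
import numpy as np


def _inv_sqrt(c):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(c)
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T


def cca_fuse(face_feats, speech_feats, n_components=2, reg=1e-6):
    """Hypothetical CCA-based fusion of two modalities (illustrative only).

    face_feats:   array of shape (n_samples, p), e.g. facial-expression features
    speech_feats: array of shape (n_samples, q), e.g. speech features
    Returns the concatenated canonical projections, shape (n_samples,
    2 * n_components), and the leading canonical correlations.
    """
    # Center each modality.
    x = face_feats - face_feats.mean(axis=0)
    y = speech_feats - speech_feats.mean(axis=0)
    n = x.shape[0]

    # Regularized covariance and cross-covariance matrices.
    cxx = x.T @ x / (n - 1) + reg * np.eye(x.shape[1])
    cyy = y.T @ y / (n - 1) + reg * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)

    # Whiten both modalities, then take the SVD of the whitened
    # cross-covariance; the singular values are the canonical correlations.
    kx, ky = _inv_sqrt(cxx), _inv_sqrt(cyy)
    u, s, vt = np.linalg.svd(kx @ cxy @ ky, full_matrices=False)

    wx = kx @ u[:, :n_components]      # projection for the first modality
    wy = ky @ vt[:n_components].T      # projection for the second modality

    # Assumed fusion rule: concatenate the two projected feature sets.
    fused = np.hstack([x @ wx, y @ wy])
    return fused, s[:n_components]


if __name__ == "__main__":
    # Synthetic stand-ins for the two modalities, sharing a 2-D latent signal.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(200, 2))
    face = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))
    speech = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(200, 4))
    fused, corrs = cca_fuse(face, speech)
    print(fused.shape, corrs)
```

In a pipeline like the one described, the fused features would then be passed to a classifier; the concatenation used here is only one possible way of combining the projections.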