Sadness
Pattern
Facial expression
Speech recognition
Computer science
Robustness (evolution)
Happiness
Anger
Facial recognition system
Emotion recognition
Artificial intelligence
Affective computing
Feature extraction
Psychology
Chemistry
Sociology
Psychiatry
Gene
Social psychology
Biochemistry
Social science
Authors
Carlos Busso,Zhigang Deng,Serdar Yıldırım,Murtaza Bulut,Chul Min Lee,Abe Kazemzadeh,Sungbok Lee,Ulrich Neumann,Shrikanth Narayanan
Identifier
DOI:10.1145/1027933.1027968
Abstract
The interaction between human beings and computers will be more natural if computers are able to perceive and respond to human non-verbal communication such as emotions. Although several approaches have been proposed to recognize human emotions from facial expressions or speech, relatively little work has been done on fusing these two, and other, modalities to improve the accuracy and robustness of emotion recognition systems. This paper analyzes the strengths and limitations of systems based only on facial expressions or acoustic information. It also discusses two approaches to fusing these modalities: decision-level and feature-level integration. Using a database recorded from an actress, four emotions were classified: sadness, anger, happiness, and the neutral state. With markers placed on her face, detailed facial motions were captured via motion capture, in conjunction with simultaneous speech recordings. The results reveal that the system based on facial expressions performed better than the system based on acoustic information alone for the emotions considered. The results also show the complementarity of the two modalities: when the two are fused, the performance and robustness of the emotion recognition system improve measurably.
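The two fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the paper's actual classifier; it only contrasts the two integration points under assumed inputs: feature-level fusion concatenates the per-modality feature vectors before a single classifier sees them, while decision-level fusion trains (or runs) one classifier per modality and combines their class posteriors afterward. The emotion labels follow the paper; the feature dimensions and the weighted-average combination rule are illustrative assumptions.

```python
import numpy as np

# The four emotion classes considered in the paper.
EMOTIONS = ["sadness", "anger", "happiness", "neutral"]

def feature_level_fusion(face_features: np.ndarray,
                         speech_features: np.ndarray) -> np.ndarray:
    """Feature-level integration: concatenate the modality feature
    vectors into one vector, which a single classifier then consumes."""
    return np.concatenate([face_features, speech_features])

def decision_level_fusion(face_posteriors: np.ndarray,
                          speech_posteriors: np.ndarray,
                          w_face: float = 0.5) -> str:
    """Decision-level integration: each modality's classifier outputs
    class posteriors; combine them (here, a weighted average, which is
    one common choice) and pick the highest-scoring emotion."""
    fused = w_face * face_posteriors + (1.0 - w_face) * speech_posteriors
    return EMOTIONS[int(np.argmax(fused))]
```

For example, if the facial classifier favors "happiness" (posteriors `[0.1, 0.2, 0.6, 0.1]`) and the acoustic classifier favors "anger" (`[0.1, 0.5, 0.3, 0.1]`), an equal-weight decision-level fusion still selects "happiness", since its averaged score (0.45) exceeds anger's (0.35).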