Computer science
Sentiment analysis
Utterance
Benchmark (surveying)
Artificial intelligence
Voting
Task (project management)
Pattern
Artificial neural network
Natural language processing
Social science
Management
Geodesy
Sociology
Economics
Geography
Operating system
Authors
Chao He, Yingshan Shen, Nan Zhong, Dongqing Song, Huijuan Hu, Dingju Zhu, Lihua Cai
Identifier
DOI:10.1109/cscwd57460.2023.10152653
Abstract
Deciphering sentiments or emotions in face-to-face human interactions is an inherent capability of human intelligence, and thus a natural goal of artificial intelligence. The proliferation of multimedia data on video sites gives rise to multimodal sentiment analysis in various applications and research fields such as movie and product reviews, opinion polling, and affective computing. To improve performance on the multimodal sentiment analysis task, this paper proposes a novel neural network with a multiple stacked attention mechanism (MSAM) applied to multimodal data containing text, video, and audio at the utterance level. We conduct experiments on two benchmark datasets, the CMU Multimodal Opinion-level Sentiment Intensity (CMU-MOSI) corpus and the CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) corpus. Compared against a comprehensive set of state-of-the-art baselines, the evaluation results demonstrate the effectiveness of our proposed MSAM network.
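The abstract describes stacking attention layers over utterance-level features from three modalities. The paper's actual MSAM architecture is not given here, so the following is only a minimal illustrative sketch of the general idea: concatenating per-modality utterance features and repeatedly applying self-attention with residual connections. All function names, dimensions, and the layer count are assumptions for illustration, not the authors' design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: softmax(QK^T / sqrt(d)) V
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ v

def stacked_multimodal_attention(text, audio, video, num_layers=2):
    # Concatenate utterance-level features from all three modalities,
    # then apply self-attention repeatedly ("stacked") with residuals.
    h = np.concatenate([text, audio, video], axis=0)  # (3*T, d)
    for _ in range(num_layers):
        h = h + attention(h, h, h)  # self-attention + residual connection
    return h

# Toy usage: 4 utterances with 8-dim features per modality (hypothetical sizes)
rng = np.random.default_rng(0)
t, a, v = (rng.normal(size=(4, 8)) for _ in range(3))
fused = stacked_multimodal_attention(t, a, v)
print(fused.shape)  # (12, 8)
```

In practice the fused representation would feed a classifier or regressor predicting sentiment intensity per utterance; learned projection weights for queries, keys, and values are omitted here for brevity.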