Computer science
Sentiment analysis
Artificial intelligence
Facial expression
Natural language processing
Face (sociological concept)
Modality (human-computer interaction)
Machine translation
Key (lock)
Matching (statistics)
Linguistics
Mathematics
Computer security
Statistics
Philosophy
Authors
Hao Yang, Yanyan Zhao, Bing Qin
Identifiers
DOI: 10.18653/v1/2022.emnlp-main.219
Abstract
Aspect-level multimodal sentiment analysis, which aims to identify the sentiment of a target aspect from multimodal data, has recently attracted extensive attention in the multimedia and natural language processing communities. Despite the recent success of textual aspect-based sentiment analysis, existing models mainly focus on utilizing object-level semantic information in the image while ignoring explicit use of visual emotional cues, especially facial emotions. How to distill visual emotional cues and align them with the textual content remains a key challenge. In this work, we introduce a face-sensitive image-to-emotional-text translation (FITE) method, which captures visual sentiment cues through facial expressions and selectively matches and fuses them with the target aspect in the textual modality. To the best of our knowledge, we are the first to explicitly utilize the emotional information from images in the multimodal aspect-based sentiment analysis task. Experimental results show that our method achieves state-of-the-art results on the Twitter-2015 and Twitter-2017 datasets. The improvement demonstrates the superiority of our model in capturing aspect-level sentiment in multimodal data with facial expressions.
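The pipeline the abstract describes (detect faces, translate their expressions into auxiliary emotional text, then fuse that text with the original sentence and the target aspect) can be illustrated with a brief sketch. This is a minimal illustration under assumptions, not the authors' implementation: `detect_faces`, `classify_expression`, and `text_sentiment_model` are hypothetical placeholders for a face detector, a facial-expression classifier, and a text-only aspect sentiment model, and the `EMOTIONS` label set is likewise assumed.

```python
from typing import Any, Callable, List

# Hypothetical emotion labels for a facial-expression classifier
# (an assumption; the abstract does not specify the label set).
EMOTIONS = ["happy", "sad", "angry", "surprised", "neutral"]


def image_to_emotional_text(
    image: Any,
    detect_faces: Callable[[Any], List[Any]],
    classify_expression: Callable[[Any], str],
) -> str:
    """Translate facial expressions in an image into a short textual cue.

    Returns an empty string when no face is found, so the pipeline stays
    "face-sensitive": images without faces contribute no emotional text.
    """
    faces = detect_faces(image)
    if not faces:
        return ""
    labels = [classify_expression(face) for face in faces]
    return "Faces in the image look " + ", ".join(labels) + "."


def aspect_sentiment(
    text: str,
    aspect: str,
    image: Any,
    detect_faces: Callable[[Any], List[Any]],
    classify_expression: Callable[[Any], str],
    text_sentiment_model: Callable[[str], str],
) -> str:
    """Fuse the emotional text with the sentence, conditioned on the target
    aspect, and hand the combined input to a text-only classifier."""
    emotional_text = image_to_emotional_text(image, detect_faces, classify_expression)
    fused = f"{text} [SEP] {emotional_text} [SEP] aspect: {aspect}"
    return text_sentiment_model(fused)


# Toy usage with dummy components, just to show the data flow.
if __name__ == "__main__":
    dummy_detect = lambda img: ["face_1"]    # pretend one face was found
    dummy_classify = lambda face: "happy"    # pretend it looks happy
    dummy_model = lambda s: "positive"       # pretend classifier output
    print(aspect_sentiment("Great show by @singer!", "singer", object(),
                           dummy_detect, dummy_classify, dummy_model))
```

Concatenating the emotional text with `[SEP]` markers mirrors how text-only transformer classifiers commonly accept auxiliary context; the paper's actual matching and fusion is selective with respect to the target aspect, which this plain concatenation does not capture.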