Computer science
Sentiment analysis
Discriminative
Modality (human-computer interaction)
Mode
Artificial intelligence
Exploit
Natural language processing
Focus (optics)
Word (group theory)
Information retrieval
Pattern recognition (psychology)
Machine learning
Linguistics
Social science
Philosophy
Physics
Computer security
Sociology
Optics
Authors
Feiran Huang,Kaimin Wei,Jian Weng,Zhoujun Li
Abstract
Sentiment analysis of social multimedia data has attracted extensive research interest and has been applied to many tasks, such as election prediction and product evaluation. Sentiment analysis of a single modality (e.g., text or image) has been broadly studied. However, much less attention has been paid to the sentiment analysis of multimodal data. Different modalities usually carry complementary information, so it is necessary to learn the overall sentiment by combining the visual content with the text description. In this article, we propose a novel method, Attention-Based Modality-Gated Networks (AMGN), to exploit the correlation between the image and text modalities and extract discriminative features for multimodal sentiment analysis. Specifically, a visual-semantic attention model is proposed to learn attended visual features for each word. To effectively combine the sentiment information in the two modalities of image and text, a modality-gated LSTM is proposed to learn multimodal features by adaptively selecting the modality that presents stronger sentiment information. Then, a semantic self-attention model is proposed to automatically focus on the discriminative features for sentiment classification. Extensive experiments have been conducted on both manually annotated and weakly machine-labeled datasets. The results demonstrate the superiority of our approach over state-of-the-art models.
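The abstract describes a pipeline with word-aligned visual features, a modality-gated LSTM, and semantic self-attention pooling before classification. The following is a minimal PyTorch sketch of the last two components only, assuming a soft sigmoid gate that mixes the word embedding with its attended visual feature at each time step; all module names, dimensions, and the exact gating formulation here are illustrative assumptions, not the paper's actual AMGN implementation.

```python
# Hypothetical sketch of a modality-gated LSTM and self-attention classifier.
# Shapes, layer names, and the gating scheme are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityGatedLSTM(nn.Module):
    """At each word position, a learned gate softly selects between the word
    embedding and its attended visual feature before updating an LSTM cell."""

    def __init__(self, text_dim, visual_dim, hidden_dim):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, text_dim)  # align visual to text space
        self.gate = nn.Linear(2 * text_dim, 1)              # per-step modality gate
        self.cell = nn.LSTMCell(text_dim, hidden_dim)

    def forward(self, words, visual):
        # words:  (batch, seq_len, text_dim)   word embeddings
        # visual: (batch, seq_len, visual_dim) attended visual feature per word
        batch, seq_len, _ = words.shape
        h = words.new_zeros(batch, self.cell.hidden_size)
        c = words.new_zeros(batch, self.cell.hidden_size)
        outputs = []
        for t in range(seq_len):
            w_t = words[:, t]
            v_t = self.visual_proj(visual[:, t])
            # Gate in (0, 1): how strongly the textual modality dominates this step.
            g = torch.sigmoid(self.gate(torch.cat([w_t, v_t], dim=-1)))
            x_t = g * w_t + (1 - g) * v_t
            h, c = self.cell(x_t, (h, c))
            outputs.append(h)
        return torch.stack(outputs, dim=1)  # (batch, seq_len, hidden_dim)


class SelfAttentionClassifier(nn.Module):
    """Self-attention pooling over the LSTM states, then a sentiment classifier."""

    def __init__(self, hidden_dim, num_classes=2):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, states):
        # states: (batch, seq_len, hidden_dim)
        weights = F.softmax(self.score(states), dim=1)  # attention weight per step
        pooled = (weights * states).sum(dim=1)          # weighted sum of states
        return self.classifier(pooled)


if __name__ == "__main__":
    # Toy shapes: 4 samples, 12 words, 300-d word embeddings, 512-d visual features.
    words = torch.randn(4, 12, 300)
    visual = torch.randn(4, 12, 512)
    lstm = ModalityGatedLSTM(text_dim=300, visual_dim=512, hidden_dim=256)
    head = SelfAttentionClassifier(hidden_dim=256)
    logits = head(lstm(words, visual))
    print(logits.shape)  # torch.Size([4, 2])
```

The soft gate here is one plausible reading of "adaptively selecting the modality"; a hard selection or a vector-valued gate would also fit the abstract's description, and the visual-semantic attention step that produces the per-word visual features is omitted.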