Computer science
Salience
Artificial intelligence
Sentiment analysis
Semantic gap
Semantics (computer science)
Benchmark (surveying)
Natural language processing
Transformer
Image (mathematics)
Pattern recognition (psychology)
Information retrieval
Image retrieval
Physics
Geodesy
Quantum mechanics
Voltage
Programming language
Geography
Authors
Jianfei Yu, Kai Chen, Rui Xia
Identifier
DOI: 10.1109/taffc.2022.3171091
Abstract
Aspect-based multimodal sentiment analysis (ABMSA) aims to determine the sentiment polarity of each aspect or entity mentioned in a multimodal post or review. Previous studies on ABMSA can be summarized into two subtasks: aspect-term based multimodal sentiment classification (ATMSC) and aspect-category based multimodal sentiment classification (ACMSC). However, these existing studies have three shortcomings: (1) ignoring the object-level semantics in images; (2) primarily focusing on aspect-text and aspect-image interactions; (3) failing to consider the semantic gap between text and image representations. To tackle these issues, we propose a general Hierarchical Interactive Multimodal Transformer (HIMT) model for ABMSA. Specifically, we extract salient features with semantic concepts from images via an object detection method, and then propose a hierarchical interaction module to first model the aspect-text and aspect-image interactions, followed by capturing the text-image interactions. Moreover, an auxiliary reconstruction module is devised to largely eliminate the semantic gap between text and image representations. Experimental results show that our HIMT model significantly outperforms state-of-the-art methods on two benchmarks for ATMSC and one benchmark for ACMSC.
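The hierarchy described in the abstract (aspect attends to text and to object-level image features first, then text and image interact, with a reconstruction term narrowing the modality gap) can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the authors' HIMT implementation: the feature dimensions, the single-head scaled dot-product attention, and the MSE form of the reconstruction loss are all placeholders for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, context, d):
    # Scaled dot-product attention: each query row attends over context rows.
    scores = query @ context.T / np.sqrt(d)
    return softmax(scores) @ context

rng = np.random.default_rng(0)
d = 8
aspect = rng.standard_normal((1, d))  # aspect/entity embedding
text = rng.standard_normal((5, d))    # token-level text features
image = rng.standard_normal((3, d))   # object-level image features (e.g. from a detector)

# Stage 1: model aspect-text and aspect-image interactions separately.
aspect_text = cross_attention(aspect, text, d)
aspect_image = cross_attention(aspect, image, d)

# Stage 2: capture text-image interaction on the aspect-aware representations.
fused = cross_attention(aspect_text, np.vstack([aspect_text, aspect_image]), d)

# Auxiliary reconstruction (toy MSE): pull the image-side representation
# toward the text-side one to reduce the semantic gap between modalities.
recon_loss = float(np.mean((aspect_image - aspect_text) ** 2))
```

In a trained model the reconstruction term would be one component of the loss optimized jointly with the sentiment classification objective; here it is only computed to show where the semantic-gap penalty sits in the pipeline.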