Computer science
Transformer (machine learning)
Modality
Artificial intelligence
Natural language processing
Sentiment analysis
Machine learning
Information retrieval
Authors
Zenan Xu, Qinliang Su, Jin-Xin Xiao
Identifier
DOI:10.1109/icme55011.2023.00239
Abstract
The multimodal aspect-based sentiment classification task (mABSC) aims to recognize the sentiment polarities of aspect entities from the associated textual and visual resources. Inspired by the cross-modal alignment ability of Transformers, some recent mABSC methods have proposed using Transformers to discover the relevance between aspect entities and visual regions, so that relevant visual regions can be identified and leveraged to help recognize the sentiment polarity of aspect entities. However, due to the limited training data in mABSC tasks, aspect entities in the Transformer often attend to irrelevant visual information, which clearly does not benefit sentiment polarity recognition. To address this issue, we seek help from external knowledge, including textual syntax and cross-modal relevancy knowledge. The basic idea is to cut off irrelevant connections within the textual modality or across modalities in the Transformer layer using a knowledge-induced matrix. To let the matrix go beyond capturing only direct relations, a mechanism is developed to enable it to reflect multi-hop relations, followed by a discretization operation to filter out extreme relevancy. Extensive experiments on two public multimodal datasets show that our method outperforms all competing baselines. Further studies demonstrate the effectiveness of each component and suggest that the introduced external knowledge can guide the model toward learning the correct relevance among textual or cross-modal features, thus benefiting the mABSC task.
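The abstract's core mechanism (a knowledge-induced matrix extended to multi-hop relations, discretized, and used to mask attention) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the power-series combination of hops, and the fixed decay and threshold values are all assumptions for the sketch.

```python
import numpy as np

def multi_hop_relevance(adj, num_hops=2, decay=0.5):
    """Extend a direct-relation (adjacency) matrix with multi-hop relations.

    Illustrative scheme (assumed, not from the paper): add decayed
    indicator terms for each hop length, rel = adj + sum_k decay^(k-1) * [adj^k > 0].
    """
    hop = adj.copy()
    rel = adj.astype(float).copy()
    for k in range(2, num_hops + 1):
        hop = hop @ adj                      # paths of length k
        rel += (decay ** (k - 1)) * (hop > 0)  # decayed multi-hop indicator
    return rel

def discretize(rel, threshold=1.0):
    """Binarize relevance, filtering out weak connections below the threshold."""
    return (rel >= threshold).astype(float)

def masked_attention(scores, mask):
    """Cut off irrelevant connections: masked positions get -inf before softmax."""
    scores = np.where(mask > 0, scores, -1e9)
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)
```

For example, with a chain relation 0-1-2, a two-hop pass gives token 0 a weak link to token 2; discretization then decides whether that link survives to gate the attention scores.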