Keywords
Computer science
Sentiment analysis
Artificial intelligence
Representation (politics)
Object (grammar)
Pattern recognition (psychology)
Orientation (vector space)
Natural language processing
Machine learning
Politics
Political science
Law
Geometry
Mathematics
Authors
Jia Li, Tinghuai Ma, Huan Rong, Najla Al-Nabhan
Identifier
DOI: 10.1109/tetc.2022.3231746
Abstract
With the development of multimodal sentiment analysis, target-level/aspect-level multimodal sentiment analysis has received increasing attention; it aims to judge the sentiment orientation of target words using visual and textual information. Most existing methods combine the whole image with the text while ignoring the implicit affective regions in the image. We introduce a novel affective region recognition and fusion network (ARFN) for target-level multimodal sentiment classification, which focuses on aligning the visual and textual modalities during multimodal fusion. First, to produce a visual representation that carries sentiment cues, ARFN employs the YOLOv5 algorithm to extract object regions from the image and selects affective regions according to a selection strategy. Next, the method learns target-sensitive visual representations and textual semantic representations through a multi-head attention mechanism and the pre-trained BERT model, respectively. Finally, ARFN fuses the textual and visual representations through a multimodal interaction method to perform target-level multimodal sentiment classification. We achieve state-of-the-art performance on two publicly available multimodal Twitter datasets, and the experimental results demonstrate the effectiveness of our approach.
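The abstract's first step, YOLOv5-based region extraction followed by a selection strategy, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the confidence threshold, the top-k cap, and the file name `tweet_image.jpg` are all illustrative assumptions, since the paper's actual selection strategy is not described here.

```python
# Sketch: extract object regions with YOLOv5 and keep a few candidate
# affective regions. Threshold/top-k are assumed, not from the paper.
import torch

# Load a small pre-trained YOLOv5 model from the official Ultralytics hub.
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

def extract_candidate_regions(image_path, conf_threshold=0.4, top_k=3):
    """Return up to `top_k` cropped object regions ranked by detector confidence."""
    results = model(image_path)
    # results.crop(save=False) yields one dict per detection, with the cropped
    # image array under 'im' and its detection confidence under 'conf'.
    crops = results.crop(save=False)
    crops = [c for c in crops if float(c['conf']) >= conf_threshold]
    crops.sort(key=lambda c: float(c['conf']), reverse=True)
    return [c['im'] for c in crops[:top_k]]

regions = extract_candidate_regions('tweet_image.jpg')  # hypothetical input image
print(f'selected {len(regions)} candidate affective regions')
```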
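The later steps, BERT text encoding, target-sensitive visual attention, and multimodal fusion for a 3-way classifier, can be sketched in the same spirit. This is a minimal sketch of the general idea, assuming ResNet-style 2048-dimensional region features and simple [CLS]-based pooling; the dimensions, the concatenation-based fusion, and the classifier head are assumptions, not the exact ARFN architecture.

```python
# Sketch: BERT encodes the text, multi-head attention lets the target
# representation query the visual regions, and the fused vector feeds a
# 3-way sentiment classifier. Architecture details are assumed.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class TargetVisualFusion(nn.Module):
    def __init__(self, visual_dim=2048, hidden_dim=768, num_classes=3):
        super().__init__()
        self.bert = BertModel.from_pretrained('bert-base-uncased')
        # Project region features into the text embedding space.
        self.visual_proj = nn.Linear(visual_dim, hidden_dim)
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(hidden_dim * 2, num_classes)

    def forward(self, input_ids, attention_mask, region_feats):
        # Text semantics: pooled [CLS] representation of the target in context.
        text = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        target_repr = text.last_hidden_state[:, 0]                     # (B, H)
        # Target-sensitive visual representation: the target queries the regions.
        regions = self.visual_proj(region_feats)                       # (B, R, H)
        attended, _ = self.cross_attn(target_repr.unsqueeze(1), regions, regions)
        # Fuse by concatenation (an assumed, simple interaction method).
        fused = torch.cat([target_repr, attended.squeeze(1)], dim=-1)  # (B, 2H)
        return self.classifier(fused)

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
enc = tokenizer('the crowd looks thrilled at the rally', return_tensors='pt')
logits = TargetVisualFusion()(enc['input_ids'], enc['attention_mask'],
                              torch.randn(1, 3, 2048))  # 3 dummy region features
print(logits.shape)  # torch.Size([1, 3])
```

The concatenation here stands in for whatever multimodal interaction method ARFN actually uses; the point of the sketch is only how the target representation attends over candidate affective regions before classification.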