Authors
Xianbing Zhao,Yixin Chen,Wanting Li,Lei Gao,Buzhou Tang
Identifier
DOI:10.1109/icassp43922.2022.9746536
Abstract
Human multimodal sentiment analysis is a challenging task that aims to extract and integrate information from multiple sources, such as language, acoustic, and visual signals. Recently, the multimodal adaptation gate (MAG), an attachment to transformer-based pre-trained language representation models such as BERT and XLNet, has shown state-of-the-art performance on multimodal sentiment analysis. However, MAG fuses multimodal information directly with only a single-layer network and ignores the relationships among different modalities. In this paper, we propose an extended MAG, called MAG+, to reinforce multimodal fusion. MAG+ contains two modules: multi-layer MAGs with modality reinforcement (M3R) and Adaptive Layer Aggregation (ALA). In each MAG with modality reinforcement of M3R, every modality is first reinforced by all other modalities via crossmodal attention, and then all modalities are fused via MAG. The ALA module leverages the multimodal representations at both low and high levels to form the final multimodal representation. Like MAG, MAG+ is attached to BERT and XLNet. Experimental results on two widely used datasets demonstrate the efficacy of our proposed MAG+.
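To make the two components concrete, below is a minimal PyTorch sketch of the ideas the abstract describes: crossmodal-attention reinforcement of each modality, a MAG-style gated fusion into the text stream, and a learned aggregation over low- and high-level fused representations. This is not the authors' released code; the module names, dimensions, gating formulation (simplified from the published MAG of Rahman et al., 2020), and the softmax layer-weighting are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class CrossmodalReinforcement(nn.Module):
    """Reinforce one modality with the others via crossmodal attention.

    Sketch only: assumes all modalities are token-aligned and already
    projected to a shared hidden size `dim`.
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, target, sources):
        # target: (B, T, dim); sources: list of (B, T, dim) tensors
        reinforced = target
        for src in sources:
            attended, _ = self.attn(target, src, src)  # target queries, source keys/values
            reinforced = reinforced + attended
        return self.norm(reinforced)


class MAG(nn.Module):
    """MAG-style adaptation gate: shift the text representation by gated
    visual/acoustic displacements (simplified from Rahman et al., 2020)."""

    def __init__(self, dim: int, beta: float = 1.0):
        super().__init__()
        self.gate_v = nn.Linear(2 * dim, 1)
        self.gate_a = nn.Linear(2 * dim, 1)
        self.proj_v = nn.Linear(dim, dim)
        self.proj_a = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)
        self.beta = beta  # shift-scaling hyperparameter (assumed value)

    def forward(self, text, visual, acoustic):
        g_v = torch.sigmoid(self.gate_v(torch.cat([text, visual], dim=-1)))
        g_a = torch.sigmoid(self.gate_a(torch.cat([text, acoustic], dim=-1)))
        shift = g_v * self.proj_v(visual) + g_a * self.proj_a(acoustic)
        return self.norm(text + self.beta * shift)


class AdaptiveLayerAggregation(nn.Module):
    """Combine multimodal representations from several layers with learned
    softmax weights, so low- and high-level fusions both contribute."""

    def __init__(self, num_layers: int):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, layer_outputs):
        # layer_outputs: list of (B, T, dim) tensors, one per fused layer
        w = torch.softmax(self.layer_logits, dim=0)
        return sum(w_i * h for w_i, h in zip(w, layer_outputs))


# Usage sketch: reinforce each modality, fuse with the gate, aggregate layers.
if __name__ == "__main__":
    B, T, dim = 2, 8, 64
    text, visual, acoustic = (torch.randn(B, T, dim) for _ in range(3))
    m3r = CrossmodalReinforcement(dim)
    mag = MAG(dim)
    ala = AdaptiveLayerAggregation(num_layers=2)

    fused_layers = []
    for _ in range(2):  # stand-in for two transformer layers with attached MAGs
        t = m3r(text, [visual, acoustic])
        v = m3r(visual, [text, acoustic])
        a = m3r(acoustic, [text, visual])
        text = mag(t, v, a)
        fused_layers.append(text)
    print(ala(fused_layers).shape)  # torch.Size([2, 8, 64])
```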