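Computer science
Modal verb
Fusion
Artificial intelligence
Sentiment analysis
Sensor fusion
Linguistics
Materials science
Philosophy
Polymer chemistry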
Authors
Yan Zhuang, Yanru Zhang, Zheng Hu, Xiaoyue Zhang, Jiawen Deng, Fuji Ren
Identifier
DOI:10.1145/3664647.3681527
Abstract
Multimodal Sentiment Analysis (MSA) has witnessed remarkable progress and gained increasing attention over the past decade. However, current MSA methodologies primarily rely on global representations extracted from different modalities, such as the mean of all token representations, to construct sophisticated fusion networks. These approaches often overlook the valuable details present in local representations, which consist of fused representations of several consecutive tokens. Additionally, the integration of multiple local representations and the fusion of local and global information present significant challenges. To address these limitations, we propose the Global-Local Modal (GLoMo) Fusion framework. It comprises two essential components: (i) modality-specific mixture of experts layers that integrate diverse local representations within each modality, and (ii) a global-guided fusion module that effectively combines global and local representations. The former component leverages specialized expert networks to automatically select and integrate crucial local representations from each modality, while the latter ensures the preservation of global information during the fusion process. We evaluate GLoMo on various datasets, encompassing tasks in multimodal sentiment analysis, multimodal humor detection, and multimodal emotion recognition. Extensive experiments demonstrate that GLoMo outperforms existing state-of-the-art models, validating the effectiveness of our proposed framework. Our code is publicly available at https://github.com/YetZzzzzz/GLoMo.
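The abstract's distinction between global and local representations, and the two components built on them, can be illustrated with a toy sketch for a single modality. This is not the authors' implementation (see the linked repository for that). Every function name, the choice of window means as "local representations", the max-pooling experts, the linear gate, and the convex-combination fusion with weight `alpha` are assumptions made only for illustration.

```python
import math

def mean_pool(vectors):
    """Global representation: element-wise mean of all token vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def local_max_pool(tokens, window):
    """Hypothetical expert: average each run of `window` consecutive tokens
    (a local representation), then max-pool across those runs."""
    windows = [mean_pool(tokens[i:i + window])
               for i in range(len(tokens) - window + 1)]
    return [max(col) for col in zip(*windows)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_of_experts(tokens, global_vec, gate_weights, windows):
    """Dense mixture of experts: a linear gate (assumed here to read the
    global vector) weights the outputs of one expert per window size."""
    scores = [sum(g * w for g, w in zip(global_vec, gw)) for gw in gate_weights]
    weights = softmax(scores)
    outputs = [local_max_pool(tokens, win) for win in windows]
    return [sum(w * out[d] for w, out in zip(weights, outputs))
            for d in range(len(global_vec))]

def global_guided_fusion(global_vec, local_vec, alpha=0.5):
    """Assumed fusion rule: a convex combination, so the global signal
    always contributes a fixed share to the fused vector."""
    return [alpha * g + (1 - alpha) * l for g, l in zip(global_vec, local_vec)]

# Four 2-dimensional token vectors standing in for one modality.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
global_vec = mean_pool(tokens)
gate_weights = [[0.0, 0.0], [0.0, 0.0]]  # zero gate: experts weighted equally
local_vec = mixture_of_experts(tokens, global_vec, gate_weights, windows=[2, 3])
fused = global_guided_fusion(global_vec, local_vec)
```

In this sketch the global vector is identical in both dimensions, while the local experts pick up that the second dimension peaks more strongly within short windows, which is the kind of detail a global mean discards.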