Topics: Computer science, Graph, Perception, Field (mathematics), Receptive field, Artificial intelligence, Deep learning, Scale (ratio), Focus (optics), Machine learning, Natural language processing, Theoretical computer science, Physics, Mathematics, Quantum mechanics, Neuroscience, Pure mathematics, Biology, Optics
Authors
Wei Jie,Guanyu Hu,Luu Anh Tuan,Xinyu Yang,Wenjing Zhu
Identifier
DOI: 10.1109/icassp49357.2023.10094596
Abstract
Emotion recognition in conversations (ERC) has gained increasing attention, with contextual information modeling and multimodal fusion being the central challenges in recent years. In this paper, we propose a Multi-Scale Receptive Field Graph model (MSRFG) to tackle the challenges of ERC. Specifically, MSRFG constructs multi-scale perception graphs and learns contextual information via parallel multi-scale receptive field paths. To compensate for the graph network's limited ability to learn temporal information, MSRFG injects temporal dependencies into the graph network to model the temporal relationships between utterances. Moreover, to achieve effective fusion of multimodal information, MSRFG converges the multi-scale features of each modality separately and learns attention weights after integrating the converged features. We carried out experiments on the IEMOCAP and MELD datasets to validate the effectiveness of the proposed method, and the results demonstrate that our model outperforms existing SOTA methods. The code is available at https://github.com/Janie1996/MSRFG.
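The abstract describes parallel receptive-field paths over a conversation graph, fused by learned attention weights. The sketch below is a minimal, hypothetical illustration of that idea only, not the authors' MSRFG implementation: each "scale" links every utterance to its neighbours within a different window size, the paths run in parallel, and their outputs are combined with attention. All names (window_adjacency, SimpleGraphPath, MultiScalePaths, the window sizes) are assumptions for illustration.

```python
import torch
import torch.nn as nn


def window_adjacency(num_nodes: int, window: int) -> torch.Tensor:
    """Row-normalised adjacency linking each utterance to neighbours within `window`."""
    idx = torch.arange(num_nodes)
    adj = ((idx.unsqueeze(0) - idx.unsqueeze(1)).abs() <= window).float()
    return adj / adj.sum(dim=-1, keepdim=True)


class SimpleGraphPath(nn.Module):
    """One receptive-field path: mean aggregation over a windowed utterance graph."""

    def __init__(self, dim: int, window: int):
        super().__init__()
        self.window = window
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (num_utterances, dim)
        adj = window_adjacency(x.size(0), self.window).to(x.device)
        return torch.relu(self.proj(adj @ x))


class MultiScalePaths(nn.Module):
    """Parallel paths with different receptive fields, fused by attention weights."""

    def __init__(self, dim: int, windows=(1, 2, 4)):  # hypothetical window sizes
        super().__init__()
        self.paths = nn.ModuleList(SimpleGraphPath(dim, w) for w in windows)
        self.attn = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.stack([p(x) for p in self.paths], dim=1)   # (N, scales, dim)
        weights = torch.softmax(self.attn(feats), dim=1)          # attention over scales
        return (weights * feats).sum(dim=1)                       # (N, dim)


if __name__ == "__main__":
    utterances = torch.randn(10, 64)        # 10 utterance features of size 64
    fused = MultiScalePaths(64)(utterances)
    print(fused.shape)                       # torch.Size([10, 64])
```

In this toy version, smaller windows capture local conversational context and larger windows capture long-range context; the paper additionally injects temporal dependencies and fuses multiple modalities, which this sketch omits.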