Authors
Qifei Li, Yingming Gao, Ya Li
Identifier
DOI:10.1145/3581783.3612862
Abstract
Automatic emotion recognition has a wide range of applications in human-computer interaction. In this paper, we present our work in the Multimodal Emotion Recognition (MER) 2023 challenge, which contains three sub-challenges: MER-MULTI, MER-NOISE, and MER-SEMI. We first use a vanilla semi-supervised method to mine high-quality samples from the MER-SEMI unlabeled dataset to expand the training set. Specifically, we ensemble three models trained on the official training set and use majority voting to select samples with high prediction consistency. The selected samples, together with the original training set, are further augmented by adding noise. Then, features for the different modalities of the expanded dataset are extracted with several pre-trained or fine-tuned models and subsequently combined into different feature sets to capture more effective emotion representations. In addition, we employ early fusion of different modal features and late fusion of different recognition models to obtain the final prediction. Experimental results show that our proposed method improves performance over the official baselines by 30.4%, 55.3%, and 1.57% on the three sub-challenges, ranking 4th, 3rd, and 5th, respectively. The present work sheds light on high-quality data mining and model ensembling by majority voting for multimodal emotion recognition.
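The abstract's sample-mining step, selecting unlabeled samples on which an ensemble of models agrees, can be sketched as follows. This is a hypothetical illustration of majority-voting-based consistency filtering, not the authors' released code; the function name, data layout, and the unanimous-agreement default are assumptions.

```python
from collections import Counter

def select_by_majority_vote(sample_ids, predictions, min_agreement=None):
    """Select unlabeled samples with high prediction consistency.

    sample_ids:  iterable of sample identifiers
    predictions: list of dicts, one per model, mapping sample_id -> predicted label
    min_agreement: votes required to accept a sample
                   (defaults to unanimity across all models)

    Hypothetical sketch of the consistency filter described in the abstract.
    """
    num_models = len(predictions)
    if min_agreement is None:
        min_agreement = num_models  # require all models to agree

    selected = {}
    for sid in sample_ids:
        votes = [preds[sid] for preds in predictions]
        label, count = Counter(votes).most_common(1)[0]
        if count >= min_agreement:
            selected[sid] = label  # keep sample with its pseudo-label
    return selected
```

Samples passing the filter would then be pseudo-labeled with the agreed prediction and merged into the training set before noise augmentation.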