A novel transformer autoencoder for multi-modal emotion recognition with incomplete data

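Keywords: Computer science; Encoder; Autoencoder; Artificial intelligence; Convolutional neural network; Transformer; Pattern recognition (psychology); Speech recognition; Deep learning; Engineering; Operating system; Electrical engineering; Voltage
Authors
Cheng Cheng,Wenzhe Liu,Zhaoxin Fan,Lin Feng,Ziyu Jia
Source
Journal: Neural Networks [Elsevier]
Volume: 172, pages 106111-106111; Citations: 34
Identifier
DOI:10.1016/j.neunet.2024.106111
Abstract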

Multi-modal signals have become essential data for emotion recognition because they represent emotions more comprehensively than any single modality. In real-world environments, however, it is often impossible to acquire complete multi-modal signals, and missing modalities cause severe performance degradation in emotion recognition. This paper therefore presents the first attempt to use a transformer-based architecture to fill in modality-incomplete data from partially observed data for multi-modal emotion recognition (MER). Concretely, it proposes a unified model called the transformer autoencoder (TAE), comprising a modality-specific hybrid transformer encoder, an inter-modality transformer encoder, and a convolutional decoder. The modality-specific hybrid transformer encoder bridges a convolutional encoder and a transformer encoder, allowing it to learn both local and global context information within each modality. The inter-modality transformer encoder builds and aligns global cross-modal correlations and models long-range contextual information across modalities. The convolutional decoder decodes the encoded features to produce more precise recognition. In addition, a regularization term is introduced into the convolutional decoder to force it to fully leverage both complete and incomplete data when recognizing emotions from missing data. The model attains accuracies of 96.33%, 95.64%, and 92.69% on the available data of the DEAP and SEED-IV datasets, and 93.25%, 92.23%, and 81.76% on the missing data. In particular, it achieves a 5.61% advantage with 70% missing data, demonstrating that it outperforms some state-of-the-art approaches in incomplete multi-modal learning.
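
The incomplete-data setting described in the abstract can be illustrated with a small sketch. The abstract does not say how the missing data are generated, so the code below is only one plausible reading: it treats the missing rate as the fraction of samples that lose one or more modalities, and it keeps at least one modality per sample so that something remains to reconstruct from. The function name `make_modality_mask`, its parameters, and both of those rules are illustrative assumptions, not the authors' protocol.

```python
import random


def make_modality_mask(n_samples, n_modalities, missing_rate, seed=0):
    """Build per-sample observation masks (hypothetical helper).

    masks[i][m] is True when modality m of sample i is observed.
    Assumption: `missing_rate` is the fraction of samples that are
    incomplete, and every sample keeps at least one modality.
    """
    if not 0.0 <= missing_rate <= 1.0:
        raise ValueError("missing_rate must lie in [0, 1]")
    if n_modalities < 2:
        raise ValueError("need at least two modalities to drop one")

    rng = random.Random(seed)
    n_incomplete = round(n_samples * missing_rate)
    # Choose which samples will be incomplete.
    incomplete = set(rng.sample(range(n_samples), n_incomplete))

    masks = []
    for i in range(n_samples):
        mask = [True] * n_modalities
        if i in incomplete:
            # Drop between 1 and n_modalities - 1 modalities,
            # so at least one modality always stays observed.
            n_drop = rng.randint(1, n_modalities - 1)
            for m in rng.sample(range(n_modalities), n_drop):
                mask[m] = False
        masks.append(mask)
    return masks


if __name__ == "__main__":
    # Two modalities, 70% of samples incomplete.
    masks = make_modality_mask(n_samples=10, n_modalities=2, missing_rate=0.7)
    for i, mask in enumerate(masks):
        print(i, mask)
```

A mask of this kind would typically be used to zero out or skip the unobserved modality inputs before encoding, with the decoder's output compared against the held-out signals; how TAE itself consumes incomplete inputs is not specified in the abstract.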