MSF-Net: Multi-stage fusion network for emotion recognition from multimodal signals in scalable healthcare

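Keywords: Computer science; Scalability; Stage (stratigraphy); Fusion; Artificial intelligence; Emotion recognition; Net (polyhedron); Speech recognition; Pattern recognition (psychology); Database; Geometry; Mathematics; Linguistics; Biology; Philosophy; Paleontology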

Authors

Md. Milon Islam, Fakhri Karray, Ghulam Muhammad

Source

Journal: Information Fusion [Elsevier BV]
Date: 2025-02-19 Volume/issue: 119: 103028-103028 Citations: 7

Identifier

DOI: 10.1016/j.inffus.2025.103028

Abstract

Automatic emotion recognition has attracted significant interest in healthcare, thanks to recent advances in smart technologies. A real-time emotion recognition system allows for continuous monitoring, comprehension, and enhancement of the physical entity’s capacities, along with ongoing advice for improving quality of life and well-being in the context of personalized healthcare. Multimodal emotion recognition presents a significant challenge in terms of efficiently using the diverse modalities present in the data. In this article, we introduce a Multi-Stage Fusion Network (MSF-Net) for emotion recognition that extracts multimodal information and achieves strong performance. We propose a transformer-based structure to extract deep features from facial expressions. We exploit two visual descriptors, local binary pattern and Oriented FAST and Rotated BRIEF, to retrieve computer vision-based features from the facial videos. A feature-level fusion network integrates the features extracted by these modules and passes the output to a triplet attention module. This module employs a three-branch architecture to compute attention weights that capture cross-dimensional interactions efficiently. The temporal dependencies in physiological signals are modeled by a Bi-directional Gated Recurrent Unit (Bi-GRU) in forward and backward directions at each time step. Lastly, the output feature representations from the triplet attention module and the high-level patterns extracted by the Bi-GRU are fused and fed into the classification module to recognize emotion. Extensive experimental evaluations revealed that the proposed MSF-Net outperformed state-of-the-art approaches on two popular datasets, BioVid Emo DB and MGEED. Finally, we tested the proposed MSF-Net in an Internet of Things environment to facilitate real-world scalable smart healthcare applications.
• Introduce a multi-stage fusion network to recognize emotion in a multimodal context.
• Propose an efficient approach to extract visual features and temporal dependencies.
• Exploit triplet attention to capture key emotional features via three-branch fusion.
• Achieve state-of-the-art results on multimodal data and validate in IoT networks.
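
The abstract names the local binary pattern (LBP) as one of the two hand-crafted visual descriptors but does not say which variant is used. As a rough illustration only, the sketch below computes the basic 3x3 LBP code and its histogram in plain Python. The function names `lbp_code` and `lbp_histogram`, the neighbour ordering, and the 3x3 neighbourhood are assumptions made for this example and are not taken from the paper.

```python
def lbp_code(image, row, col):
    """Return the 8-bit basic LBP code of the interior pixel at (row, col).

    `image` is a list of rows of grey levels. Each of the 8 neighbours
    contributes one bit: 1 if it is >= the centre pixel, else 0.
    """
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel.
    # This ordering is an assumption; other orderings are equally valid.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Usage: a single 3x3 patch has exactly one interior pixel.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
```

In a pipeline like the one described, a histogram of this kind would be one hand-crafted feature vector per face region, to be fused with the other features. How the paper actually pools or normalizes the descriptor is not stated here.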
