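Variation (astronomy)
Facial expression recognition
Facial expression
Computer science
Artificial intelligence
Speech recognition
Pattern recognition (psychology)
Facial recognition system
Physics
Astrophysics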
Authors
Bei Pan,Kaoru Hirota,Yaping Dai,Zhiyang Jia,Shuai Shao,Jinhua She
Identifier
DOI:10.1109/tnnls.2025.3548669
Abstract
A multiscale sequence information fusion (MSSIF) method is presented for dynamic facial expression recognition (DFER) in video sequences. It exploits multiscale information by integrating features from individual frames, subsequences, and entire sequences through a transformer-based architecture. This hierarchical feature fusion process includes deep feature extraction at the frame level to capture intricate visual details, intrasubsequence fusion using self-attention mechanisms for analyzing adjacent frames, and intersubsequence fusion to synthesize long-term emotional dynamics across time scales. The efficacy of MSSIF is demonstrated through extensive evaluation on three video datasets: eNTERFACE'05, BAUM-1s, and AFEW, where it achieves overall recognition accuracies of 60.1%, 60.7%, and 58.8%, respectively. These results substantiate MSSIF's superior performance in accurately recognizing facial expressions by managing short- and long-term dependencies within video sequences, making it a potent tool for real-world applications requiring nuanced dynamic facial expression detection.
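The three-level fusion described in the abstract (frame features, then fusion within subsequences, then fusion across subsequences) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the fixed subsequence length, the identity query/key/value projections, and the mean pooling are all assumptions chosen to keep the example dependency-free, and the frame features are placeholder vectors rather than the output of a deep network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a real transformer learns these projections.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    attended = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        weights = softmax(scores)
        attended.append([sum(w * v[i] for w, v in zip(weights, vectors))
                         for i in range(dim)])
    return attended


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count
            for i in range(len(vectors[0]))]


def fuse_sequence(frame_features, subseq_len):
    """Hypothetical hierarchical fusion: frames -> subsequences -> sequence."""
    # Split the frame-level features into consecutive subsequences.
    subsequences = [frame_features[i:i + subseq_len]
                    for i in range(0, len(frame_features), subseq_len)]
    # Intrasubsequence fusion: attend over adjacent frames, then pool.
    subseq_vectors = [mean_pool(self_attention(s)) for s in subsequences]
    # Intersubsequence fusion: attend over subsequence summaries, then pool.
    return mean_pool(self_attention(subseq_vectors))


# Placeholder 2-D "frame features" for a six-frame clip.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
          [0.5, 0.5], [0.0, 0.0], [1.0, 0.0]]
video_vector = fuse_sequence(frames, subseq_len=2)
```

In a full model, the resulting sequence-level vector would feed a classifier over expression labels; the sketch stops at the fused representation because the abstract gives no details of the classification head.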