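Computer science
Recurrent neural network
Artificial intelligence
Facial expression
Transformer
Convolutional neural network
Perception
Benchmark (surveying)
Deep learning
Term (time)
Speech recognition
Pattern recognition (psychology)
Machine learning
Artificial neural network
Physics
Geodesy
Quantum mechanics
Voltage
Neuroscience
Biology
Geography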
Authors
Charles Lu, Yue Jiang, Keren Fu, Qijun Zhao, Hongyu Yang
Identifier
DOI:10.1007/978-981-99-8469-5_14
Abstract
Dynamic facial expression recognition (DFER) is a challenging task. Prior methods based on convolutional neural networks (CNNs), recurrent neural networks (RNNs), or Transformers focus on extracting either long-term or short-term temporal information from facial videos. Unlike these approaches, we design a novel framework named long short-term perception network (LSTPNet), which perceives both kinds of temporal cues and brings notable advantages to the DFER task. Specifically, we propose a temporal channel excitation (TCE) module that builds on the efficient channel attention (ECA) module. This extension gives the backbone network temporal attention capabilities, helping it acquire richer temporal features. Furthermore, we design a long short-term temporal Transformer (LSTformer) that effectively captures both short-term and long-term temporal information. Experiments on three benchmark datasets demonstrate the state-of-the-art performance of LSTPNet.