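Architecture
Memristor
Speech recognition
Computer science
Cognitive science
Artificial intelligence
Psychology
Engineering
History
Electrical engineering
Archaeology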
Authors
Tianhao Zhao, Yue Zhou, Xiaofang Hu
Identifier
DOI:10.1142/s0218127424501177
Abstract
Speech Emotion Recognition (SER) is a challenging task characterized by the diversity and complexity of emotional expression. Due to its powerful feature extraction capabilities, the Transformer Network (TN) demonstrates advantages and potential in SER. However, the limited size of available datasets and the difficulty of decoupling emotional features restrain its performance and present challenges in implementing SER on edge devices. To address these issues, we present a Memristor-based Progressive Hierarchical Conformer Architecture (MPCA) and design a conformer submodule that leverages convolution to mitigate TN’s limitations in SER. We propose attention-based feature decoupling, employing hierarchical extraction to decouple speaker characteristics and retain the relevant components, thereby obtaining reliable emotional features. Furthermore, we propose a reconfigurable circuit implementation scheme for MPCA based on operator multiplexing, achieving flexible modules that can be dynamically adjusted based on the resources of edge devices; the stability of the designed circuit is analyzed through simulation experiments with PSPICE. We show that the proposed MPCA achieves state-of-the-art performance in SER while significantly reducing system power consumption, offering a solution for SER implementation on edge devices.