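Computer science
Convolutional neural network
Fuse (electrical)
Artificial intelligence
Channel (broadcasting)
Deep learning
Feature (linguistics)
Speech recognition
Representation (politics)
Scale (ratio)
Pattern recognition (psychology)
Linguistics
Electrical engineering
Engineering
Computer network
Philosophy
Physics
Quantum mechanics
Politics
Law
Political science
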
Authors
Tianqi Wu, Liejun Wang, Zhang Jiang
            
Identifier
DOI: 10.1007/978-981-99-8067-3_34
        
                
Abstract
Speech emotion recognition (SER) plays a crucial role in understanding user intent and improving human-computer interaction (HCI). Currently, the most widely used and effective methods are based on deep learning. Temporal information has become increasingly important in existing SER research. Although some advanced deep learning methods, such as convolutional neural networks (CNNs) and attention modules, achieve good results, they often ignore the temporal information in speech, which can lead to insufficient representation and low classification accuracy. To make full use of temporal features, we propose channel-aware multi-scale temporal convolutional networks (CM-TCN). First, a channel-aware temporal convolutional network (CATCN) serves as the basic structure, extracting multi-scale temporal features that incorporate channel information. Then, global feature attention (GFA) captures global information at different time scales and enhances the important information. Finally, an adaptive fusion module (AFM) establishes the overall dependency among different network layers and fuses their features. We conduct extensive experiments on six datasets, and the experimental results demonstrate the superior performance of CM-TCN.
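
The three stages named in the abstract (multi-scale temporal convolution with channel information, attention-style reweighting, and fusion across scales) can be illustrated with a toy sketch in plain Python. This is not the authors' CM-TCN implementation: the abstract gives no layer definitions, so every function name here (causal_dilated_conv, channel_gate, multi_scale_features) is hypothetical, the channel gate is an unlearned sigmoid of the channel mean standing in for a learned channel-aware block, and the fusion weights are supplied by hand instead of being learned by an adaptive fusion module.

```python
import math

def causal_dilated_conv(x, kernel, dilation):
    """Causal dilated 1-D convolution on one channel.

    y[t] = sum_k kernel[k] * x[t - k * dilation], treating samples
    before t = 0 as zero, so y[t] never depends on future inputs.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out

def channel_gate(channels):
    """Assumed stand-in for channel awareness: scale each channel
    by sigmoid(mean of that channel over time). No learned weights."""
    gated = []
    for ch in channels:
        mean = sum(ch) / len(ch)
        g = 1.0 / (1.0 + math.exp(-mean))
        gated.append([g * v for v in ch])
    return gated

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_scale_features(channels, kernel, dilations, scale_scores):
    """Run one gated convolution branch per dilation (time scale) and
    fuse the branches with softmax weights. The scores are hand-set
    here; a real fusion module would learn them."""
    weights = softmax(scale_scores)
    fused = [[0.0] * len(ch) for ch in channels]
    for w, d in zip(weights, dilations):
        branch = channel_gate(
            [causal_dilated_conv(ch, kernel, d) for ch in channels]
        )
        for c, ch in enumerate(branch):
            for t, v in enumerate(ch):
                fused[c][t] += w * v
    return fused
```

For example, causal_dilated_conv([1, 2, 3, 4], [1, 1], 2) returns [1.0, 2.0, 4.0, 6.0]: with dilation 2, each output adds the current sample to the one two steps earlier, which is how larger dilations widen the temporal context without adding kernel weights.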
         
            
 
                 
                
                    