Gestures

Topics
Computer Science
Gesture Recognition
Speech Recognition
Usability
Face (sociological concept)
Human-Computer Interaction
Artificial Intelligence
Social Science
Sociology
Authors
Zisu Li, Liang Chen, Yuntao Wang, Yue Qin, Chun Yu, Yukang Yan, Mingming Fan, Yuanchun Shi
            
    
            
Identifier
DOI: 10.1145/3544548.3581008
        
                
Abstract

Gestures performed accompanying the voice are essential for voice interaction to convey complementary semantics for interaction purposes such as wake-up state and input modality. In this paper, we investigated voice-accompanying hand-to-face (VAHF) gestures for voice interaction. We targeted hand-to-face gestures because such gestures relate closely to speech and yield significant acoustic features (e.g., impeding voice propagation). We conducted a user study to explore the design space of VAHF gestures, where we first gathered candidate gestures and then applied a structural analysis to them along different dimensions (e.g., contact position and type), yielding a total of 8 VAHF gestures with good usability and the least confusion. To facilitate VAHF gesture recognition, we proposed a novel cross-device sensing method that leverages heterogeneous channels (vocal, ultrasound, and IMU) of data from commodity devices (earbuds, watches, and rings). Our recognition model achieved an accuracy of 97.3% for recognizing 3 gestures and 91.5% for recognizing 8 gestures (excluding the "empty" gesture), demonstrating its high applicability. Quantitative analysis also shed light on the recognition capability of each sensor channel and their different combinations. Finally, we illustrated feasible use cases and their design principles to demonstrate the applicability of our system in various scenarios.
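The abstract describes a cross-device recognition pipeline that fuses vocal, ultrasound, and IMU channels captured by earbuds, watches, and rings. As a rough illustration of that early-fusion idea only (the abstract does not specify the paper's features or classifier), the sketch below concatenates hypothetical per-channel features and trains an off-the-shelf classifier on synthetic data; every feature extractor, array shape, and the random-forest choice is an assumption, not the authors' implementation.

# Minimal sketch of multi-channel feature fusion for VAHF gesture
# classification. Hypothetical features and shapes; NOT the paper's model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def extract_vocal_features(audio):
    # Placeholder stand-in for acoustic features from the earbud microphone.
    return np.array([audio.mean(), audio.std(), np.abs(audio).max()])

def extract_ultrasound_features(echo):
    # Placeholder stand-in for reflection statistics of an ultrasound echo frame.
    return np.array([echo.mean(), echo.std()])

def extract_imu_features(imu):
    # Placeholder stand-in for per-axis motion statistics from watch/ring IMUs.
    return np.concatenate([imu.mean(axis=0), imu.std(axis=0)])

def fuse(audio, echo, imu):
    # Early fusion: concatenate per-channel features into one vector.
    return np.concatenate([
        extract_vocal_features(audio),
        extract_ultrasound_features(echo),
        extract_imu_features(imu),
    ])

# Synthetic stand-in data: 200 samples over 8 gesture classes.
n_samples, n_classes = 200, 8
X = np.stack([
    fuse(rng.normal(size=16000),      # 1 s of audio at 16 kHz (assumed)
         rng.normal(size=4000),       # one ultrasound echo frame (assumed)
         rng.normal(size=(100, 6)))   # 100 IMU frames: accel + gyro (assumed)
    for _ in range(n_samples)
])
y = rng.integers(0, n_classes, size=n_samples)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # chance-level here: data is random

Because each channel keeps its own feature slice in the fused vector, the per-channel and per-combination comparisons mentioned in the abstract could, under this sketch's assumptions, be approximated simply by dropping the corresponding slices before training.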
         
            
 
                 
                
                    