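Keyword spotting
Computer science
Softmax function
Speech recognition
Pattern recognition (psychology)
Artificial intelligence
Sliding window protocol
Feature vector
Feature (linguistics)
Dynamic time warping
Memory footprint
Spotting
Constant false alarm rate
Artificial neural network
Window (computing)
Linguistics
Operating system
Philosophy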
Authors
Guoguo Chen,Carolina Parada,Tara N. Sainath
Identifier
DOI:10.1109/icassp.2015.7178970
Abstract
We present a novel approach to query-by-example keyword spotting (KWS) using a long short-term memory (LSTM) recurrent neural network-based feature extractor. In our approach, we represent each keyword using a fixed-length feature vector obtained by running the keyword audio through a word-based LSTM acoustic model. We use the activations prior to the softmax layer of the LSTM as our keyword-vector. At runtime, we detect the keyword by extracting the same feature vector from a sliding window and computing a simple similarity score between this test vector and the keyword vector. With clean speech, we achieve 86% relative false rejection rate reduction at 0.5% false alarm rate when compared to a competitive phoneme posteriorgram with dynamic time warping KWS system, while the reduction in the presence of babble noise is 67%. Our system has a small memory footprint, low computational cost, and high precision, making it suitable for on-device applications.
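The runtime detection step described in the abstract (extract a fixed-length vector from each sliding window, then score it against the stored keyword vector) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `embed_window` is a hypothetical stand-in that averages frames in place of the LSTM feature extractor, cosine similarity is assumed for the "simple similarity score", and the names `spot_keyword`, `window_size`, `hop`, and `threshold` are invented for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def embed_window(frames):
    """Hypothetical placeholder for the LSTM feature extractor.

    The abstract uses the LSTM activations prior to the softmax layer;
    here we simply average the frames to obtain a fixed-length vector.
    """
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def spot_keyword(frames, keyword_vector, window_size, threshold, hop=1):
    """Slide a window over the frames and return (start_index, score) for each
    window whose similarity to the keyword vector meets the threshold."""
    detections = []
    for start in range(0, len(frames) - window_size + 1, hop):
        window = frames[start:start + window_size]
        score = cosine_similarity(embed_window(window), keyword_vector)
        if score >= threshold:
            detections.append((start, score))
    return detections


# Toy usage: 2-dimensional "frames" and a keyword vector pointing along the second axis.
toy_frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
toy_keyword = [0.0, 1.0]
toy_detections = spot_keyword(toy_frames, toy_keyword, window_size=2, threshold=0.9)
```

In this toy run only the last window, which consists entirely of frames aligned with the keyword vector, clears the threshold. In a real system the threshold would be tuned to trade off false alarms against false rejections, which is the operating point the abstract reports at a 0.5% false alarm rate.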