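Kurtosis
Speech recognition
Computer science
Intelligibility (philosophy)
Vowel
Noise (video)
Background noise
Pattern recognition (psychology)
Artificial intelligence
Mathematics
Statistics
Telecommunications
Epistemology
Image (mathematics)
Philosophy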
Authors
Guoping Li,Mark E. Lutman,Shouyan Wang,Stefan Bleeck
Identifier
DOI:10.3109/14992027.2011.625984
Abstract
Objective: Established methods for predicting speech recognition in noise require knowledge of clean speech signals, placing limitations on their application. The study evaluates an alternative approach based on characteristics of noisy speech, specifically its sparseness as represented by the statistic kurtosis. Design: Experiments 1 and 2 involved acoustic analysis of vowel-consonant-vowel (VCV) syllables in babble noise, comparing kurtosis, glimpsing areas, and extended speech intelligibility index (ESII) of noisy speech signals with one another and with pre-existing speech recognition scores. Experiment 3 manipulated kurtosis of VCV syllables and investigated effects on speech recognition scores in normal-hearing listeners. Study sample: Pre-existing speech recognition data for Experiments 1 and 2; seven normal-hearing participants for Experiment 3. Results: Experiments 1 and 2 demonstrated that kurtosis calculated in the time-domain from noisy speech is highly correlated (r > 0.98) with established prediction models: glimpsing and ESII. All three measures predicted speech recognition scores well. The final experiment showed a clear monotonic relationship between speech recognition scores and kurtosis. Conclusions: Speech recognition performance in noise is closely related to the sparseness (kurtosis) of the noisy speech signal, at least for the types of speech and noise used here and for listeners with normal hearing.
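To make the central quantity concrete, the following is a minimal sketch of time-domain kurtosis computed directly from a noisy signal, with no access to the clean signal. It assumes the Pearson definition (fourth central moment divided by squared variance, so a Gaussian signal scores about 3). The abstract does not specify the authors' windowing, filtering, or normalization, so the helper names `kurtosis` and `mix`, the synthetic "sparse" signal, and the noise gains are all hypothetical illustrations rather than the study's procedure.

```python
import math
import random


def kurtosis(samples):
    """Pearson kurtosis: fourth central moment divided by squared variance."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n  # variance (population form)
    m4 = sum((x - mean) ** 4 for x in samples) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant signal")
    return m4 / (m2 ** 2)


def mix(signal, noise, noise_gain):
    """Add scaled noise to a signal, sample by sample (hypothetical helper)."""
    return [s + noise_gain * v for s, v in zip(signal, noise)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 20000

# Assumption: a stand-in for sparse speech, silent about 90% of the time
# with occasional Gaussian bursts. This is not real VCV speech material.
sparse = [rng.gauss(0.0, 1.0) if rng.random() < 0.1 else 0.0 for _ in range(n)]

# Assumption: Gaussian noise as a stand-in for the babble noise in the study.
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Kurtosis of the mixture falls toward the Gaussian value as noise increases.
for gain in (0.0, 0.1, 0.5, 1.0):
    print(gain, round(kurtosis(mix(sparse, noise, gain)), 2))
```

In this toy setting, the sparser (less noisy) mixtures yield higher kurtosis, which is the direction of the relationship the abstract reports between kurtosis and speech recognition scores. Real babble noise is not Gaussian, so the numbers here are illustrative only.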