语音识别
估计
还原(数学)
降噪
计算机科学
噪音(视频)
数学
人工智能
工程类
几何学
系统工程
图像(数学)
作者
Alexander Schasse,Rainer Martin
标识
DOI:10.1109/taslp.2014.2329633
摘要
Recently, it has been proposed to use the minimum-variance distortionless-response (MVDR) approach in single-channel speech enhancement in the short-time frequency domain. By applying optimal FIR filters to each subband signal, these filters reduce additive noise components with less speech distortion compared to conventional approaches. An important ingredient to these filters is the temporal correlation of the speech signals. We derive algorithms to provide a blind estimation of this quantity based on a maximum-likelihood and maximum a-posteriori estimation. To derive proper models for the inter-frame correlation of the speech and noise signals, we investigate their statistics on a large dataset. If the speech correlation is properly estimated, the previously derived subband filters discussed in this work show significantly less speech distortion compared to conventional noise reduction algorithms. Therefore, the focus of the experimental parts of this work lies on the quality and intelligibility of the processed signals. To evaluate the performance of the subband filters in combination with the clean speech inter-frame correlation estimators, we predict the speech quality and intelligibility by objective measures.
科研通智能强力驱动
Strongly Powered by AbleSci AI