Monaural
Computer science
Speech enhancement
Speech recognition
Perception
Artificial intelligence
Psychology
Noise reduction
Neuroscience
Authors
Tian Lan, Jiajia Li, Wenxin Tai, Cong Chen, J. KANG, Qiao Liu
Identifier
DOI:10.1109/icme52920.2022.9858924
Abstract
Speech enhancement aims to recover clean speech from complex noise backgrounds. This paper proposes a novel information processing module, dubbed the dynamic scene perception module (DSPM), that can help existing systems accommodate various complex scenarios. DSPM is inspired by the observation that different regions of the noisy spectrum in different scenarios have different enhancement requirements. Concretely, DSPM consists of two parts, one for dynamic scene estimation and the other for adaptive region perception. In particular, the scene estimator utilizes a spectrum-energy-based attention mechanism to obtain the coefficients of each convolution kernel. Then, at each position, the region perceptron chooses the corresponding kernels by considering the requirements of the current region (preserve vocals or suppress noise). Systematic evaluations on the TIMIT corpus and Voice Bank + DEMAND demonstrate the effectiveness of our method. Compared with existing systems, our proposed method achieves better performance under various SNR conditions and complex noise scenarios.
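The two-stage idea in the abstract (scene-level kernel coefficients, then a per-position choice among kernels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no architectural details, so every function name, parameter (`w`, `b`, `a`, `c`), the mean-energy pooling, and the soft (softmax) selection rule are assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def conv2d_same(x, kernel):
    """2-D cross-correlation with zero padding; assumes odd kernel sizes."""
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def scene_coefficients(spec, w, b):
    """Hypothetical scene estimator: spectrum energy -> one coefficient per kernel."""
    energy = (spec ** 2).mean(axis=1)   # mean energy per frequency bin, shape (F,)
    return softmax(w @ energy + b)      # shape (K,), sums to 1

def dspm_sketch(spec, kernels, w, b, a, c):
    """Illustrative DSPM-like block on a magnitude spectrogram of shape (F, T)."""
    coeffs = scene_coefficients(spec, w, b)                        # (K,)
    responses = np.stack([conv2d_same(spec, k) for k in kernels])  # (K, F, T)
    # Hypothetical region perceptron: local energy drives a per-position
    # soft selection over kernels, biased by the scene-level coefficients.
    local_energy = conv2d_same(spec ** 2, np.ones((3, 3)) / 9.0)   # (F, T)
    region_logits = a[:, None, None] * local_energy[None] + c[:, None, None]
    weights = softmax(np.log(coeffs + 1e-12)[:, None, None] + region_logits, axis=0)
    out = (weights * responses).sum(axis=0)                        # (F, T)
    return out, coeffs, weights

# Usage with random (untrained) parameters; shapes only, not real speech.
rng = np.random.default_rng(0)
spec = np.abs(rng.normal(size=(8, 10)))   # 8 frequency bins, 10 frames
K = 4
kernels = rng.normal(size=(K, 3, 3))
w, b = rng.normal(size=(K, 8)), np.zeros(K)
a, c = rng.normal(size=K), np.zeros(K)
out, coeffs, weights = dspm_sketch(spec, kernels, w, b, a, c)
```

In this sketch the scene coefficients act as a global prior over kernels, while the per-position weights let high-energy regions (likely speech) and low-energy regions (likely noise) favor different kernels. A trained system would learn all of these parameters; the values here are random placeholders.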