Mechanism (biology)
Channel (broadcasting)
Computer science
Neuroscience
Psychology
Telecommunications
Physics
Quantum mechanics
Authors
Zichun Hua, Zhigang Lian
Source
Journal: Noise Control Engineering Journal
[Institute of Noise Control Engineering of the USA]
Date: 2025-09-06
Volume (issue): 73 (4): 537-552
Abstract
Speech enhancement techniques aim to extract clean speech from noisy speech signals and thereby improve the performance of speech communication, recognition, and interaction systems. In complex noisy environments such as airports, where background noise is diverse and changes dynamically, traditional enhancement methods struggle to remain effective. Generative Adversarial Networks (GANs) have been widely used for speech enhancement, but the Speech Enhancement Generative Adversarial Network (SEGAN) still lacks robustness in complex non-stationary noise. To address this issue, this paper proposes an SE-block Speech Enhancement Generative Adversarial Network (SSEGAN), which introduces a channel attention mechanism to strengthen the model’s focus on speech-critical signals. The mechanism applies global average pooling followed by a fully connected network to learn and assign a weight to each feature channel, giving the generator dynamic attention over speech-critical features. By amplifying the response of important channels and suppressing redundant or noise-dominated ones, the model extracts the effective components of speech more accurately and models speech structure better. Experimental results show that SSEGAN outperforms the original SEGAN in signal-to-noise ratio (SNR) improvement, speech quality, and intelligibility. It scores highly in subjective quality assessment, achieves a statistically significant advantage in intelligibility, and reduces inference time. These results verify the effectiveness of the channel attention mechanism in complex noise environments and suggest new directions for optimizing speech enhancement in practical applications.
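The channel attention step described in the abstract (global average pooling, then a fully connected network that produces one weight per feature channel) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the two-layer bottleneck with ReLU and sigmoid follows the usual squeeze-and-excitation design, and the reduction ratio `r`, the layer sizes, the random weights, and the helper names (`se_block`, `matvec`, `random_matrix`) are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v, b):
    """Dense layer: W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def se_block(features, W1, b1, W2, b2):
    """Apply channel attention to `features` (C channels x T time steps).

    Returns the reweighted features and the per-channel weights.
    """
    # Squeeze: global average pooling collapses each channel to one number.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one weight in (0, 1) per channel.
    hidden = [max(0.0, h) for h in matvec(W1, squeezed, b1)]
    weights = [sigmoid(z) for z in matvec(W2, hidden, b2)]
    # Scale: multiply every sample in a channel by that channel's weight.
    scaled = [[w * x for x in ch] for w, ch in zip(weights, features)]
    return scaled, weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights (assumption: untrained, random)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
C, T, r = 8, 16, 4  # channels, time steps, reduction ratio (all illustrative)
features = [[rng.gauss(0.0, 1.0) for _ in range(T)] for _ in range(C)]
W1, b1 = random_matrix(C // r, C, rng), [0.0] * (C // r)
W2, b2 = random_matrix(C, C // r, rng), [0.0] * C

scaled, weights = se_block(features, W1, b1, W2, b2)
print([round(w, 3) for w in weights])
```

In a trained model the weights `W1`, `b1`, `W2`, `b2` would be learned jointly with the generator, so channels dominated by noise would tend toward weights near 0 and speech-critical channels toward weights near 1.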