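Computer science
Deep learning
Audio signal processing
Speech recognition
Artificial intelligence
Convolutional neural network
Speech processing
Audio signal
Key (lock)
Signal processing
Pattern recognition (psychology)
Speech coding
Digital signal processing
Computer hardware
Computer security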
Authors
H.‐G. Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, Tara N. Sainath
Source
Journal: IEEE Journal of Selected Topics in Signal Processing
[Institute of Electrical and Electronics Engineers]
Date: 2019-04-01
Volume (issue): pages: 13 (2): 206-219
Citations: 755
Identifiers
DOI: 10.1109/jstsp.2019.2908700
Abstract
Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e., audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.