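Ambisonics
Aliasing
Spherical harmonics
Sparse approximation
Computer science
Convolutional neural network
Algorithm
Pattern recognition (psychology)
Encoding (memory)
Frequency domain
Speech recognition
Artificial intelligence
Mathematics
Acoustics
Computer vision
Loudspeaker
Physics
Mathematical analysis
Undersampling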
Authors
Shan Gao,Lin Jing,Xihong Wu,Tianshu Qu
Identifier
DOI:10.1109/taslp.2022.3153266
Abstract
The performance of higher-order Ambisonics (HOA) signals obtained with the spherical harmonics decomposition method is degraded by two primary sources of error: noise contamination in the low-frequency band and spatial aliasing in the high-frequency band. Inspired by the HOA signal upscaling method, which exploits the sparsity of the sound field, this paper proposes a sound field decomposition model based on a sparse deep neural network that provides HOA signals with a wider frequency bandwidth. We use a frequency-domain multi-scale convolutional network to perform the spherical harmonics decomposition and to learn the spatial aliasing pattern, from which aliasing-free HOA signals can be derived. In addition, we apply a sparse encoding network to capture the sparse features of the sound field, which improves model performance when the sparsity condition is satisfied. The experimental results show that the proposed model can obtain HOA signals with a wider operating frequency range under multiple sources (up to 10 sources) and in low-reverberation environments ($T_{60} \le 400$ ms). When the sparsity condition cannot be satisfied ($T_{60} = 800$ ms), the proposed network model still matches the performance of the traditional methods.
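For orientation, the conventional spherical harmonics decomposition that the abstract contrasts with can be sketched as follows. This is not the paper's network; it is a minimal illustration under idealized assumptions: a first-order expansion only, a hypothetical six-point sampling layout (octahedron vertices), pressure sampled directly on the unit sphere, and no radial (mode-strength) equalization or frequency dependence. All function and variable names are made up for this example.

```python
import math

# Hypothetical sampling layout: the six vertices of an octahedron.
# Equal-weight quadrature on these points is exact for the products of
# order-0 and order-1 harmonics used below.
DIRECTIONS = [
    (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
    (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
]


def real_sh_order1(x, y, z):
    """Real orthonormal spherical harmonics up to order 1 for a unit vector.

    Returned in the order (Y_0^0, Y_1^-1, Y_1^0, Y_1^1).
    """
    c0 = 1.0 / math.sqrt(4.0 * math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    return [c0, c1 * y, c1 * z, c1 * x]


def synthesize(coeffs, directions=DIRECTIONS):
    """Evaluate a first-order field with the given coefficients at each direction."""
    return [
        sum(c * b for c, b in zip(coeffs, real_sh_order1(*d)))
        for d in directions
    ]


def encode_first_order(samples, directions=DIRECTIONS):
    """Estimate first-order coefficients by equal-weight quadrature on the sphere."""
    weight = 4.0 * math.pi / len(directions)
    coeffs = [0.0, 0.0, 0.0, 0.0]
    for p, d in zip(samples, directions):
        for n, basis in enumerate(real_sh_order1(*d)):
            coeffs[n] += weight * basis * p
    return coeffs


if __name__ == "__main__":
    true_coeffs = [1.0, 0.5, -0.3, 0.2]
    estimate = encode_first_order(synthesize(true_coeffs))
    print([round(v, 6) for v in estimate])
```

With only a handful of sampling points, field components of sufficiently high order can no longer be separated from the low-order ones and fold into the estimated coefficients; this is the spatial aliasing at high frequencies that the abstract refers to.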