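Binaural recording
Computer science
Azimuth
Speech recognition
Artificial neural network
Reverberation
Deep neural network
Artificial intelligence
Classifier (UML)
Pattern recognition (psychology)
Acoustics
Mathematics
Geometry
Physics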
Authors
Ning Ma, Guy J. Brown, Tobias May
Identifier
DOI:10.21437/interspeech.2015-665
Abstract
This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for binaural localisation of multiple speakers in reverberant conditions. DNNs are used to map binaural features, consisting of the complete cross-correlation function (CCF) and interaural level differences (ILDs), to the source azimuth. Our approach was evaluated using a localisation task in which sources were located in a full 360-degree azimuth range. As a result, front-back confusions often occurred due to the similarity of binaural features in the front and rear hemifields. To address this, a head movement strategy was incorporated in the DNN-based model to help reduce the front-back errors. Our experiments show that, compared to a system based on a Gaussian mixture model (GMM) classifier, the proposed DNN system substantially reduces localisation errors under challenging acoustic scenarios in which multiple speakers and room reverberation are present.
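The two binaural features named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `binaural_features`, the `max_lag` parameter, the energy normalisation, and the sign conventions (positive lag means the left-ear signal is delayed; positive ILD means the left ear is louder) are all assumptions made for illustration. Any auditory front end or per-frequency-channel processing is omitted.

```python
import numpy as np

def binaural_features(left, right, max_lag=16):
    """Return (ccf, ild_db) for one frame of a left/right ear signal pair.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    energy_l = np.sum(left ** 2)
    energy_r = np.sum(right ** 2)

    # Full cross-correlation; index len(right) - 1 corresponds to zero lag.
    full = np.correlate(left, right, mode="full")
    zero = len(right) - 1
    ccf = full[zero - max_lag : zero + max_lag + 1]

    # Normalise by the frame energies so the values lie in [-1, 1].
    ccf = ccf / (np.sqrt(energy_l * energy_r) + 1e-12)

    # ILD: ratio of left-ear to right-ear energy, in decibels.
    ild_db = 10.0 * np.log10((energy_l + 1e-12) / (energy_r + 1e-12))
    return ccf, ild_db
```

In a system like the one described, the whole `ccf` vector together with `ild_db` would be passed to the classifier, rather than only the lag of the CCF peak.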