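Computer science
Convolutional neural network
Feature extraction
Artificial intelligence
Pattern recognition (psychology)
Margin (machine learning)
Feature (linguistics)
Artificial neural network
Frequency domain
Representation (politics)
Speech recognition
Computer vision
Machine learning
Philosophy
Linguistics
Politics
Political science
Law
Authors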
An Dang,Toan H. Vu,Jia‐Ching Wang
Identifier
DOI:10.1109/icce.2018.8326315
Abstract
Audio scenes are often composed of a variety of sound events from different sources. Their content varies widely in both the frequency and time domains. Convolutional neural networks (CNNs) provide an effective way to extract spatial information from multidimensional data such as images, audio, and video; they can learn hierarchical representations from the time-frequency features of audio signals. In this paper, we develop a convolutional neural network and employ multi-scale, multi-feature extraction methods for acoustic scene classification. We conduct experiments on the TUT Acoustic Scenes 2016 dataset. Experimental results show that multi-scale, multi-feature extraction significantly improves the performance of the system. Our proposed approach achieves an accuracy of 85.9%, outperforming the baseline approach by a margin of 8.7%.
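The abstract does not say which scales or features the authors use, so the following is only a minimal sketch of one possible reading of "multi-scale" time-frequency extraction: computing log power spectra of the same signal at several analysis window lengths. The function names, the window lengths (32, 64, 128 samples), the 50% overlap, and the Hann window are all assumptions made for illustration, not details from the paper.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing partial frame dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def log_power_spectrum(frame):
    """Hann-window one frame and return its log power spectrum (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)))
                for k, x in enumerate(frame)]
    spectrum = []
    for b in range(n // 2 + 1):
        # Naive DFT, adequate for a demo; real audio would use an FFT library.
        coeff = sum(x * cmath.exp(-2j * math.pi * b * k / n)
                    for k, x in enumerate(windowed))
        spectrum.append(math.log(abs(coeff) ** 2 + 1e-10))
    return spectrum


def multi_scale_features(signal, frame_lens=(32, 64, 128)):
    """Return {frame_len: list of per-frame log power spectra}, 50% overlap.

    The frame lengths are hypothetical; the paper's scales are not given.
    """
    return {
        n: [log_power_spectrum(f) for f in frame_signal(signal, n, n // 2)]
        for n in frame_lens
    }


# Toy input: a sine at 0.25 cycles per sample, 256 samples long.
signal = [math.sin(2 * math.pi * 0.25 * t) for t in range(256)]
features = multi_scale_features(signal)
```

Short windows give finer time resolution and long windows give finer frequency resolution, so each entry of `features` is a differently shaped time-frequency map of the same audio. In a setup like the one the abstract describes, maps of this kind would be the inputs a CNN learns from.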