Lyrics
Computer science
Music information retrieval
Mood
Set (abstract data type)
Feature (linguistics)
Speech recognition
Audio signal processing
Artificial intelligence
Natural language processing
Audio analyzer
Audio signal
Psychology
Linguistics
Musical theatre
Speech coding
Art
Philosophy
Literature
Psychiatry
Visual arts
Programming language
Authors
Xiao Hu,Kahyun Choi,J. Stephen Downie
Abstract
This research proposes a framework for music mood classification that uses multiple and complementary information sources, namely, music audio, lyric text, and social tags associated with music pieces. This article presents the framework and a thorough evaluation of each of its components. Experimental results on a large data set of 18 mood categories show that combining lyrics and audio significantly outperformed systems using audio‐only features. Automatic feature selection techniques were further shown to have reduced the feature space. In addition, the examination of learning curves shows that the hybrid systems using lyrics and audio needed fewer training samples and shorter audio clips to achieve the same or better classification accuracies than systems using lyrics or audio alone. Last but not least, performance comparisons reveal the relative importance of audio and lyric features across mood categories.
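The abstract describes combining lyric and audio features for mood classification. The sketch below is not the authors' system; it is a minimal illustration of one common hybrid approach, feature-level (early) fusion, where audio and lyric feature vectors are concatenated and fed to a classifier. The feature values, the two mood labels, and the toy nearest-centroid classifier are all hypothetical.

```python
# Illustrative sketch (not the paper's implementation): early fusion of
# audio and lyric feature vectors, classified with a toy nearest-centroid
# model. All feature values and mood labels here are made up.
from math import dist

def fuse(audio_feats, lyric_feats):
    """Concatenate audio and lyric feature vectors (feature-level fusion)."""
    return audio_feats + lyric_feats

# Hypothetical training data: fused vectors labelled with mood categories.
train = {
    "calm":       [fuse([0.2, 0.1], [0.9, 0.0]), fuse([0.3, 0.2], [0.8, 0.1])],
    "aggressive": [fuse([0.9, 0.8], [0.1, 0.9]), fuse([0.8, 0.9], [0.2, 0.8])],
}

# Average each class's vectors to get one centroid per mood category.
centroids = {
    mood: [sum(col) / len(vecs) for col in zip(*vecs)]
    for mood, vecs in train.items()
}

def classify(fused_vec):
    """Predict the mood whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda mood: dist(centroids[mood], fused_vec))

print(classify(fuse([0.85, 0.75], [0.15, 0.85])))  # → aggressive
```

A real system would replace the toy vectors with, e.g., spectral audio descriptors and bag-of-words lyric features, and the centroid rule with a trained classifier; the fusion step itself stays the same.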