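Computer science
Convolutional neural network
Representation (politics)
Artificial intelligence
Contrast (vision)
Sample (material)
Encoding
Speech recognition
Pattern recognition (psychology)
Machine learning
Natural language processing
Gene
Politics
Chemistry
Biochemistry
Law
Chromatography
Political science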
Authors
Xingchen Ma,Hongyu Yang,Qiang Chen,Di Huang,Yunhong Wang
Source
Venue: ACM Multimedia
Date: 2016-10-12
Citations: 218
Identifier
DOI:10.1145/2988257.2988267
Abstract
This paper presents a novel and effective audio-based method for depression classification. It focuses on two important issues, i.e., data representation and sample imbalance, which are not well addressed in the literature. For the former, in contrast to traditional shallow hand-crafted features, we propose a deep model, namely DepAudioNet, to encode the depression-related characteristics in the vocal channel, combining a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) to deliver a more comprehensive audio representation. For the latter, we introduce a random sampling strategy in the model training phase to balance the positive and negative samples, which largely alleviates the bias caused by uneven sample distribution. Evaluations are carried out on the DAIC-WOZ dataset for the Depression Classification Sub-challenge (DCC) at the 2016 Audio-Visual Emotion Challenge (AVEC), and the experimental results clearly demonstrate the effectiveness of the proposed approach.
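The abstract does not spell out how the random sampling strategy is implemented. As a rough illustration of the general idea only, the sketch below randomly undersamples the majority class so that positive and negative samples appear in equal numbers. The function name `balanced_sample`, the toy segment names, and the 3-versus-7 label split are all hypothetical and are not taken from the paper.

```python
import random


def balanced_sample(samples, labels, seed=0):
    """Randomly undersample so both classes contribute the same number of items.

    Generic class-balancing sketch; not the authors' exact procedure.
    """
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    n = min(len(pos), len(neg))  # size of the smaller class
    chosen = [(s, 1) for s in rng.sample(pos, n)]
    chosen += [(s, 0) for s in rng.sample(neg, n)]
    rng.shuffle(chosen)  # mix the two classes before training
    return chosen


# Hypothetical toy data: 3 positive (1) and 7 negative (0) audio segments.
segments = [f"seg{i}" for i in range(10)]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
batch = balanced_sample(segments, labels)
```

Drawing a fresh balanced subset for each training pass (for example by varying `seed`) would let the model eventually see most majority-class samples while keeping every pass balanced; whether the paper does this is an assumption here.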