Computer science
Convolutional neural network
Artificial intelligence
Graph
Fusion
Pattern recognition (psychology)
Machine learning
Theoretical computer science
Philosophy
Linguistics
Authors
Yan Xing,Yanan Yang,Min Hu,Daolun Li,Wenshu Zha,Henry Han
Abstract
Depression is a debilitating mental disorder affecting more than 350 million individuals globally, so an efficient automated depression detection model is needed to aid clinical diagnosis. Current approaches, however, fail to fully exploit the connections between data from different modalities. To address this issue, we introduce a multimodal fusion model that takes textual, audio, and visual data as input for depression detection. Specifically, BERT and Bi-LSTM are employed to extract textual features, and Bi-LSTM is applied to capture audio and visual characteristics. A deep graph convolutional neural network then fuses the features from all three modalities. The model extracts depression-related information across multiple modalities and fuses this information effectively. Experiments on the DAIC-WOZ dataset yielded an F1-score of 96.30%, surpassing other advanced techniques and demonstrating the effectiveness of the proposed approach.
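The fusion step described in the abstract, where a graph convolution mixes per-modality features, can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the dimensions, the fully connected three-node modality graph, and the random weights are all assumptions for demonstration.

```python
import numpy as np

# Hypothetical sketch: each modality (text, audio, visual) contributes one
# node with a feature vector, and a single graph convolution layer mixes them.
rng = np.random.default_rng(0)
d_in, d_out = 8, 4

# Per-modality features, standing in for outputs of BERT+Bi-LSTM (text)
# and Bi-LSTM (audio, visual); rows: text, audio, visual.
H = rng.standard_normal((3, d_in))

# Fully connected modality graph with self-loops, symmetrically normalized:
# A_hat = D^{-1/2} A D^{-1/2}
A = np.ones((3, 3))
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

# One GCN layer: H' = ReLU(A_hat @ H @ W)
W = rng.standard_normal((d_in, d_out))
H_fused = np.maximum(A_hat @ H @ W, 0.0)

print(H_fused.shape)  # (3, 4): fused representation per modality node
```

In the actual model the graph would be deeper and learned end-to-end with the feature extractors; this sketch only shows the standard GCN propagation rule applied to three modality nodes.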