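Computer science
Deep learning
Artificial intelligence
Transfer learning
Convolutional neural network
Machine learning
Artificial neural network
Support vector machine
Feature extraction
Pattern recognition (psychology)
Autoencoder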
Authors
Ermal Toto, ML Tlachac, Elke A. Rundensteiner
Source
Venue: Conference on Information and Knowledge Management
Date: 2021-10-26
Pages: 4145-4154
Identifier
DOI:10.1145/3459637.3481895
Abstract
Depression is a leading cause of disability with tremendous socioeconomic costs. In spite of early detection being crucial to improving prognosis, this mental illness remains largely undiagnosed. Depression classification from voice holds the promise to revolutionize diagnosis by ubiquitously integrating this screening capability into virtual assistants and smartphone technologies. Unfortunately, due to privacy concerns, audio datasets with depression labels have a small number of participants, causing current classification models to suffer from low performance. To tackle this challenge, we introduce Audio-Assisted BERT (AudiBERT), a novel deep learning framework that leverages the multimodal nature of human voice. To alleviate the small data problem, AudiBERT integrates pretrained audio and text representation models for the respective modalities augmented by a dual self-attention mechanism into a deep learning architecture. AudiBERT applied to depression classification consistently achieves promising performance with an increase in F1 scores between 6% and 30% compared to state-of-the-art audio and text models for 15 thematic question datasets. Using answers from medically targeted and general wellness questions, our framework achieves F1 scores of up to 0.92 and 0.86, respectively, demonstrating the feasibility of depression screening from informal dialogue using voice-enabled technologies.