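Computer science
Modalities
Deep learning
Artificial intelligence
Depression (economics)
Machine learning
Mechanism (biology)
Data science
Social science
Epistemology
Philosophy
Macroeconomics
Sociology
Economics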
Authors
Ermal Toto, ML Tlachac, Elke A. Rundensteiner
Source
Venue: Conference on Information and Knowledge Management
Date: 2021-10-26
Citations: 30
Identifiers
DOI: 10.1145/3459637.3481895
Abstract
Depression is a leading cause of disability with tremendous socioeconomic costs. In spite of early detection being crucial to improving prognosis, this mental illness remains largely undiagnosed. Depression classification from voice holds the promise to revolutionize diagnosis by ubiquitously integrating this screening capability into virtual assistants and smartphone technologies. Unfortunately, due to privacy concerns, audio datasets with depression labels have a small number of participants, causing current classification models to suffer from low performance. To tackle this challenge, we introduce Audio-Assisted BERT (AudiBERT), a novel deep learning framework that leverages the multimodal nature of human voice. To alleviate the small data problem, AudiBERT integrates pretrained audio and text representation models for the respective modalities augmented by a dual self-attention mechanism into a deep learning architecture. AudiBERT applied to depression classification consistently achieves promising performance with an increase in F1 scores between 6% and 30% compared to state-of-the-art audio and text models for 15 thematic question datasets. Using answers from medically targeted and general wellness questions, our framework achieves F1 scores of up to 0.92 and 0.86, respectively, demonstrating the feasibility of depression screening from informal dialogue using voice-enabled technologies.