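Spectrogram
Modalities
Computer science
Artificial intelligence
Field
Dementia
Visualization
Image fusion
Multimodality
Pattern recognition (psychology)
Speech recognition
Disease
Image (mathematics)
Medicine
Social science
Pathology
Sociology
World Wide Web
Political science
Law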
Authors
Ivan Krstev, Milan Pavikjevikj, Martina Toshevska, Sonja Gievska
Identifier
DOI:10.1007/978-3-031-06018-2_6
Abstract
This research explores the viability of multimodal fusion of linguistic and acoustic biomarkers in speech to help identify a person with probable symptoms of Alzheimer’s dementia. To capture the effect of dementia on a person’s language and verbal abilities, a novel approach to disease detection was explored, based on visual analysis of spectrogram images extracted from patients’ interview recordings. We put forward three fusion methods that allow the major advancements in representation learning to be utilized. The objective of the empirical study and the ensuing discussion presented in this paper was threefold: 1) to examine the potential of state-of-the-art transformer-based architectures and transfer learning to assist disease diagnosis, 2) to map the problem of acoustic analysis into the realm of image processing, by transforming spectrograms into images and employing pretrained deep neural networks, such as ResNet, to extract visual patterns, and 3) to investigate the sound interplay of multimodal biomarkers of Alzheimer’s dementia when fusing the representations learned in different modalities. We present the results of independent evaluations of the unimodal methods, against which the fusion methods have been compared.
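The abstract does not name its three fusion methods, so the following is only a minimal sketch of two generic strategies for combining modalities: feature-level (early) fusion by concatenation and decision-level (late) fusion by weighted averaging. The function names, vector sizes, and numeric values are hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Sequence


def early_fusion(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Feature-level fusion: concatenate per-modality feature vectors."""
    return list(text_vec) + list(image_vec)


def late_fusion(probs: Sequence[float], weights: Sequence[float]) -> float:
    """Decision-level fusion: weighted average of per-modality probabilities."""
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probs, weights)) / total


# Hypothetical values: a short text embedding (e.g. from a transformer) and a
# short spectrogram-image embedding (e.g. from a ResNet-style network).
text_embedding = [0.12, -0.40, 0.33]
image_embedding = [0.90, 0.05]
joint_features = early_fusion(text_embedding, image_embedding)

# Hypothetical per-modality probabilities of the positive class, equally weighted.
fused_score = late_fusion([0.8, 0.6], [1.0, 1.0])
```

In early fusion a single classifier would be trained on `joint_features`, whereas late fusion keeps the unimodal classifiers separate and combines only their outputs, which makes it easy to compare against the unimodal baselines the abstract mentions.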