The Detection of Parkinson's Disease From Speech Using Voice Source Information

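Computer science, Speech recognition, Prosody, Phonation, Feature extraction, Classifier (UML), Artificial intelligence, Pattern recognition (psychology), Medicine, Audiology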

Authors

N. P. Narendra,Björn W. Schuller,Paavo Alku

Source

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing [Institute of Electrical and Electronics Engineers]
Date: 2021-01-01 Volume/issue: 29: 1925-1936 Citations: 79

Links

nbn-resolving.org uni-augsburg.de doi.org

Identifiers

DOI: 10.1109/taslp.2021.3078364

Abstract

Developing automatic methods to detect Parkinson's disease (PD) from speech has attracted increasing interest, as these techniques can potentially be used in telemonitoring health applications. This article studies the utilization of voice source information in the detection of PD using two classifier architectures: a traditional pipeline approach and an end-to-end approach. The former consists of feature extraction and classifier stages. In feature extraction, the baseline acoustic features (articulation, phonation, and prosody features) were computed, and voice source information was extracted using glottal features that were estimated by the iterative adaptive inverse filtering (IAIF) and quasi-closed phase (QCP) glottal inverse filtering methods. Support vector machine classifiers were developed utilizing the baseline and glottal features extracted from every speech utterance and the corresponding healthy/PD labels. The end-to-end approach uses deep learning models which were trained using both raw speech waveforms and raw voice source waveforms. To obtain the latter, two glottal inverse filtering methods (IAIF and QCP) and the zero frequency filtering method were utilized. The deep learning architecture consists of a combination of convolutional layers followed by a multilayer perceptron. Experiments were performed using the PC-GITA speech database. Among the traditional pipeline systems, the highest classification accuracy (67.93%) was given by the combination of baseline and QCP-based glottal features. Among the end-to-end systems, the highest accuracy (68.56%) was given by the system trained using QCP-based glottal flow signals. Even though classification accuracies were modest for all systems, the study is encouraging, as the extraction of voice source information was found to be most effective in both approaches.
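
The abstract names zero frequency filtering as one way of obtaining a voice source waveform but does not describe it. The following is a minimal sketch of the method as it is commonly described in the speech processing literature, not the authors' implementation: the speech signal is differenced, passed through two cascaded resonators with poles at zero frequency, and the slowly growing trend is removed by repeatedly subtracting a local mean. The function names, the three trend-removal passes, the window of about 1.5 pitch periods, and the synthetic 100 Hz test tone are all assumptions made for illustration.

```python
import math


def remove_local_mean(signal, half_window):
    """Subtract from each sample the mean of its neighbourhood (clipped at the edges)."""
    n_samples = len(signal)
    out = []
    for n in range(n_samples):
        lo = max(0, n - half_window)
        hi = min(n_samples, n + half_window + 1)
        out.append(signal[n] - sum(signal[lo:hi]) / (hi - lo))
    return out


def zero_frequency_filter(speech, half_window, trend_passes=3):
    """Return a zero-frequency-filtered version of `speech` (a list of floats)."""
    # Step 1: difference the signal to remove any constant offset.
    x = [0.0] + [speech[n] - speech[n - 1] for n in range(1, len(speech))]
    # Step 2: cascade of two resonators, each y[n] = x[n] + 2*y[n-1] - y[n-2]
    # (a double pole at z = 1, i.e. at zero frequency).
    y = x
    for _ in range(2):
        out = []
        for n, value in enumerate(y):
            prev1 = out[n - 1] if n >= 1 else 0.0
            prev2 = out[n - 2] if n >= 2 else 0.0
            out.append(value + 2.0 * prev1 - prev2)
        y = out
    # Step 3: remove the polynomial trend left by the resonators.
    # The number of passes is an assumption of this sketch.
    for _ in range(trend_passes):
        y = remove_local_mean(y, half_window)
    return y


def positive_zero_crossings(signal):
    """Indices where the signal goes from negative to non-negative."""
    return [n for n in range(1, len(signal)) if signal[n - 1] < 0.0 <= signal[n]]


# Hypothetical demo input: a 100 Hz tone sampled at 8 kHz stands in for voiced speech.
fs, f0 = 8000, 100
speech = [math.sin(2.0 * math.pi * f0 * n / fs) for n in range(800)]
pitch_period = fs // f0                      # 80 samples
zff = zero_frequency_filter(speech, half_window=(3 * pitch_period) // 4)
epochs = positive_zero_crossings(zff)        # candidate excitation instants
print(len(epochs))
```

In descriptions of this method, the positive-going zero crossings of the filtered signal are usually taken as estimates of the glottal closure instants. The local-mean window needs to be on the order of the average pitch period, so in practice it would be set from a pitch estimate rather than from a known `f0` as in this demo.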
