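Computer science
Syllable
Speech recognition
Decoding methods
Artificial neural network
Language model
Auditory cortex
Speech production
Speech synthesis
Artificial intelligence
Psychology
Neuroscience
Telecommunications
Authors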
Yan Liu,Zehao Zhao,Minpeng Xu,Haiqing Yu,Yanming Zhu,Jie Zhang,Linghao Bu,Xiaoluo Zhang,Junfeng Lu,Yuanning Li,Dong Ming,Jinsong Wu
Source
Journal: Science Advances
[American Association for the Advancement of Science (AAAS)]
Date: 2023-06-09
Volume (Issue): 9 (23)
Citations: 23
Identifier
DOI:10.1126/sciadv.adh0478
Abstract
Recent studies have shown the feasibility of speech brain-computer interfaces (BCIs) as a clinically valid treatment in helping nontonal language patients with communication disorders restore their speech ability. However, tonal language speech BCI is challenging because additional precise control of laryngeal movements to produce lexical tones is required. Thus, the model should emphasize the features from the tonal-related cortex. Here, we designed a modularized multistream neural network that directly synthesizes tonal language speech from intracranial recordings. The network decoded lexical tones and base syllables independently via parallel streams of neural network modules inspired by neuroscience findings. The speech was synthesized by combining tonal syllable labels with nondiscriminant speech neural activity. Compared to commonly used baseline models, our proposed models achieved higher performance with modest training data and computational costs. These findings raise a potential strategy for approaching tonal language speech restoration.