Transfer Learning for Low-Resource, Multi-Lingual, and Zero-Shot Multi-Speaker Text-to-Speech

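Computer science, Naturalness, Speech recognition, Transfer of learning, Artificial intelligence, Intelligibility (philosophy), Natural language processing, Speaker recognition, Feature (linguistics), Linguistics, Philosophy, Physics, Epistemology, Quantum mechanics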
Authors
Myeonghun Jeong, Minchan Kim, Byoung Jin Choi, Jaesam Yoon, Won Jang, Nam Soo Kim
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing [Institute of Electrical and Electronics Engineers]
Volume: 32, pages 1519-1530. Cited by: 1
Identifier
DOI: 10.1109/taslp.2024.3364085
Abstract

Although neural text-to-speech (TTS) models show remarkable performance, they still require a large amount of paired <speech, text> data, which is expensive to collect. Because of this heavy demand for paired data, TTS models support only a small number of speakers and languages. To address this problem, we introduce a transfer learning framework for multi-lingual, zero-shot multi-speaker, and low-resource TTS. First, we pretrain our model in an unsupervised manner on a multi-lingual, multi-speaker, speech-only dataset, using self-supervised speech representations as intermediate linguistic representations. Given this pretrained linguistic information, we then apply supervised learning to the TTS model with a small amount of paired data. The pretrained linguistic representations extracted from the large-scale speech-only dataset facilitate phoneme-to-linguistic-feature matching, which provides good guidance for supervised learning with a limited amount of labeled data. We evaluate the proposed model on low-resource, multi-lingual, and zero-shot multi-speaker TTS tasks. The experimental results demonstrate that the proposed method outperforms the baseline in terms of naturalness, intelligibility, and speaker similarity.
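
The two-stage idea in the abstract can be illustrated with a deliberately tiny toy, which is not the authors' model. Everything here is an assumption made for illustration: "speech" is a list of one-dimensional frames, the self-supervised encoder is replaced by a nearest-centroid quantizer (`encode_units`, `CENTROIDS`), phonemes are assumed to be frame-aligned, and all function names are hypothetical. Stage 1 uses speech only to learn how to turn intermediate units back into frames; stage 2 uses a very small paired set to learn the phoneme-to-unit matching.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for a self-supervised speech encoder: each 1-D frame
# is mapped to the index of its nearest centroid (a pseudo-linguistic unit).
CENTROIDS = [0.0, 1.0, 2.0]

def encode_units(frames):
    return [min(range(len(CENTROIDS)), key=lambda k: abs(f - CENTROIDS[k]))
            for f in frames]

def pretrain_decoder(speech_only):
    """Stage 1 (no text): learn unit -> average frame from unlabeled speech."""
    sums, counts = defaultdict(float), Counter()
    for frames in speech_only:
        for unit, frame in zip(encode_units(frames), frames):
            sums[unit] += frame
            counts[unit] += 1
    return {unit: sums[unit] / counts[unit] for unit in counts}

def finetune_text_to_units(paired):
    """Stage 2 (small paired set): learn phoneme -> most frequent unit.

    Assumes one frame per phoneme, which real TTS data does not provide.
    """
    votes = defaultdict(Counter)
    for phonemes, frames in paired:
        for phoneme, unit in zip(phonemes, encode_units(frames)):
            votes[phoneme][unit] += 1
    return {ph: c.most_common(1)[0][0] for ph, c in votes.items()}

def synthesize(phonemes, text_to_unit, decoder):
    """Text -> units (from stage 2) -> frames (from stage 1)."""
    return [decoder[text_to_unit[ph]] for ph in phonemes]

# Toy data: plenty of unlabeled speech, a single paired utterance.
speech_only = [[0.1, 0.9, 2.1], [-0.1, 1.1, 1.9]]
paired = [(["a", "b"], [0.05, 2.0])]

decoder = pretrain_decoder(speech_only)
text_to_unit = finetune_text_to_units(paired)
audio = synthesize(["b", "a"], text_to_unit, decoder)
```

In this sketch the decoder never sees text, so enlarging `speech_only` improves synthesis without any additional transcription, while the paired data only has to cover the comparatively simple phoneme-to-unit mapping.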