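Computer science
Prosody
Speech recognition
Identity (music)
Speech analysis
Speech synthesis
Quality (philosophy)
Artificial intelligence
Natural language processing
Acoustics
Epistemology
Physics
Philosophy
Authors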
Berrak Şişman,Junichi Yamagishi,Simon King,Haizhou Li
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
[Institute of Electrical and Electronics Engineers]
Date: 2021-01-01
Volume and pages: 29: 132-157
Citations: 149
Identifier
DOI:10.1109/taslp.2020.3038524
Abstract
Speaker identity is one of the important characteristics of human speech. In voice conversion, we change the speaker identity from one to another, while keeping the linguistic content unchanged. Voice conversion involves multiple speech processing techniques, such as speech analysis, spectral conversion, prosody conversion, speaker characterization, and vocoding. With the recent advances in theory and practice, we are now able to produce human-like voice quality with high speaker similarity. In this article, we provide a comprehensive overview of the state-of-the-art of voice conversion techniques and their performance evaluation methods from the statistical approaches to deep learning, and discuss their promise and limitations. We will also report the recent Voice Conversion Challenges (VCC), the performance of the current state of technology, and provide a summary of the available resources for voice conversion research.