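Image synthesis
Computer science
Dialog box
Generative grammar
Image (mathematics)
Task (project management)
Artificial intelligence
Field (mathematics)
Natural language processing
Generative model
Information retrieval
Mathematics
Management
World Wide Web
Pure mathematics
Economics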
Authors
Rui Zhou, Cong Jiang, Qingyang Xu
Source
Journal: Neurocomputing
[Elsevier BV]
Date: 2021-09-01
Volume: 451, pages 316-336
Citations: 23
Identifier
DOI:10.1016/j.neucom.2021.04.069
Abstract
Text-to-image synthesis is a new challenge in the field of image synthesis. In earlier research, text-to-image synthesis was mainly a matter of aligning words and images through retrieval based on sentences or keywords. With the development of deep learning, and especially the application of deep generative models to image synthesis, the field has made promising progress. Generative adversarial networks (GANs) are among the most significant generative models and have been successfully applied in computer vision, natural language processing, and other areas. In this paper, we review and summarize recent research on GAN-based text-to-image synthesis and trace the development of classic and advanced models. The input to GAN-based text-to-image synthesis is no longer limited to the general text descriptions used in earlier studies; it also includes scene layouts and dialog text. The typical structure of each category is elaborated. General text-based image synthesis is the most common form of text-to-image synthesis, and it is subdivided into three groups according to improvements in text information utilization, network structure, and output control conditions. The survey thus presents a detailed and logical overview of the evolution of GAN-based text-to-image synthesis. Finally, open challenges and future directions for text-to-image synthesis are discussed.