计算机科学
面子(社会学概念)
分割
人工智能
计算机视觉
可控性
过程(计算)
像素
集合(抽象数据类型)
社会科学
数学
应用数学
社会学
程序设计语言
操作系统
作者
Xiaoxiong Du,Jun Ping Peng,Yiyi Zhou,Jinlu Zhang,S. Y. Chen,Guannan Jiang,Xiaoshuai Sun,Rongrong Ji
标识
DOI:10.1145/3581783.3612067
摘要
Synthesizing vivid human portraits is a research hot spot in image generation with a wide scope of applications. In addition to fidelity, generation controllability is another key factor that has long plagued its development. To address this issue, existing solutions usually adopt either textual or visual conditions for the target face synthesis, e.g., descriptions or segmentation masks, which still cannot fully control the generation due to the intrinsic shortages of each condition. In this paper, we propose to make use of both types of prior information to facilitate controllable face generation. In particular, we hope to produce coarse-grained information about faces based on the segmentation masks, such as face shapes and poses, and the text description is used to render detailed face attributes, e.g., face color, makeup and gender. More importantly, we hope that the generation can be easily controlled via interactively editing both types of information, making face generation more applicable to real-world applications. To accomplish this target, we propose a novel face generation model termed PixelFace+. In PixelFace+, both the text and mask are encoded as pixel-wise priors, based on which the pixel synthesis process is conducted to produce the expected portraits. Meanwhile, the loss objectives are also carefully designed to make sure that the generated faces are semantically aligned with both text and mask inputs. To validate the proposed PixelFace+, we conducted a comprehensive set of experiments on the widely recognized benchmark called MMCelebA. We not only quantitatively compare PixelFace+ with a bunch of newly proposed Text-to-Face(T2F) generation methods, but also give plenty of qualitative analyses. The experimental results demonstrate that PixelFace+ not only outperforms existing generation methods in both image quality and conditional matching but also shows a much superior controllability of face generation. More importantly, PixelFace+ presents a convenient and interactive way of face generation and manipulation via editing the text and mask inputs. Our SOURCE CODE and DEMO are given in our supplementary materials.
科研通智能强力驱动
Strongly Powered by AbleSci AI