Computer science
Quality (philosophy)
Food quality
Artificial intelligence
Multimedia
Food science
Epistemology
Philosophy
Chemistry
Authors
Dongjian Yu, Weiqing Min, Xin Jin, Qian Jiang, Ying Jin, Shuqiang Jiang
Abstract
Food image generation has promising applications in food design, advertising, and food education. However, existing methods rely on information such as recipes, ingredients, or food names, so the generated food images show limited intra-class diversity. Even when the recipe, ingredients, and food name are identical, real-world images of the same food may vary significantly in appearance. Ensuring both the quality and the diversity of the generated images is therefore a key issue. To this end, we combine a pre-trained diffusion model with a Transformer and propose CW-Food, a method for generating diverse, high-quality images of both Chinese and Western food. Unlike previous works that use a single overall food feature to generate new images, CW-Food first decouples food images to obtain common intra-class features and private instance features. We then design a Transformer-based feature fusion module to integrate the common and private features, in order to avoid the shortcomings of conventional methods. The backbone is a pre-trained diffusion model, which is fine-tuned using LoRA with the fused multi-variate features. Extensive experiments on four datasets demonstrate the advantages of the proposed method, which produces diverse and high-quality food images encompassing both Chinese and Western cuisines. To the best of our knowledge, our work is the first attempt to generate Chinese food images using only food names.
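The abstract states that the diffusion backbone is fine-tuned using LoRA. As a rough illustration of that general technique only (not the CW-Food implementation, whose details are not given here), the sketch below shows a linear layer with a frozen weight and a trainable low-rank update. The class name `LoRALinear`, the rank `r`, the scaling `alpha`, and the toy dimensions are all assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W plus a low-rank update (alpha / r) * B @ A.

    Hypothetical illustration of LoRA; W stands in for a pre-trained weight.
    """

    def __init__(self, d_in, d_out, r=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        # Frozen "pre-trained" weight (random stand-in), shape d_out x d_in.
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable factors: A is r x d_in (small random), B is d_out x r (zeros),
        # so the layer initially behaves exactly like the frozen layer.
        self.A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        """x is a batch of rows (batch x d_in); returns batch x d_out."""
        xt = [list(col) for col in zip(*x)]        # d_in x batch
        base = matmul(self.W, xt)                  # frozen path
        low = matmul(self.B, matmul(self.A, xt))   # low-rank path
        out = [[b + self.scale * l for b, l in zip(brow, lrow)]
               for brow, lrow in zip(base, low)]
        return [list(col) for col in zip(*out)]    # batch x d_out

    def trainable_params(self):
        """Only A and B would be updated during fine-tuning."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])
```

Only `A` and `B` would be trained in this setup, which is why the trainable parameter count scales with `r * (d_in + d_out)` rather than `d_in * d_out`. How the fused common and private features condition the backbone is not specified in the abstract and is not modeled here.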