A study of the evaluation metrics for generative images containing combinational creativity

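Consistency (knowledge base), Metric (unit), Ranking (information retrieval), Combinational logic, Creativity, Generative grammar, Computer science, Image (mathematics), Rank (graph theory), Artificial intelligence, Machine learning, Mathematics, Algorithm, Engineering, Logic gate, Psychology, Combinatorics, Social psychology, Operations management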
Authors
Boheng Wang,Yunhuai Zhu,Liuqing Chen,Jingcheng Liu,Lingyun Sun,Peter R.N. Childs
Source
Journal: Artificial intelligence for engineering design, analysis and manufacturing [Cambridge University Press]
Volume: 37
Identifier
DOI:10.1017/s0890060423000069
Abstract

In machine content generation, the state-of-the-art text-to-image model DALL⋅E shows advanced and diverse capacities for generating combinational images from specific textual prompts. The images generated by DALL⋅E appear to exhibit a level of combinational creativity close to that of humans when visualizing a combinational idea. Several common metrics, such as IS, FID, GIQA, and CLIP, can be applied to assess the quality of images produced by generative models, but it is unclear whether they are equally applicable to images containing combinational creativity. In this study, we collected images generated by a machine (DALL⋅E) and by human designers. The group rankings from the Consensual Assessment Technique (CAT) and the Turing Test (TT) were used as benchmarks for assessing combinational creativity. Because the metrics rest on different mathematical principles and take different starting points in evaluating image quality, we introduced two comparable spaces: coincident rate (CR) and average rank variation (ARV). We then conducted an experiment that calculated how consistent each metric's group ranking was with the benchmarks. By comparing the CR and ARV consistency results, we summarized the applicability of existing evaluation metrics to generative images containing combinational creativity. Of the four metrics, GIQA was the most consistent with the CAT and TT. It therefore shows potential as an automated assessment for images containing combinational creativity, and could be used in relevant design and engineering tasks such as conceptual sketches, digital design images, and prototyping images.
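The abstract names coincident rate (CR) and average rank variation (ARV) but does not define them. As a rough illustration of how a metric's group ranking might be compared with a benchmark ranking, the sketch below assumes that CR is the fraction of rank positions on which two rankings agree and that ARV is the mean absolute shift in rank position per item. Both definitions, the function names, and the group labels are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence


def _check_same_items(benchmark: Sequence[str], candidate: Sequence[str]) -> None:
    """Ensure both rankings are non-empty orderings of the same items."""
    if not benchmark:
        raise ValueError("rankings must not be empty")
    if len(benchmark) != len(candidate) or set(benchmark) != set(candidate):
        raise ValueError("rankings must order the same set of items")


def coincident_rate(benchmark: Sequence[str], candidate: Sequence[str]) -> float:
    """Assumed definition: share of positions where both rankings place the same item."""
    _check_same_items(benchmark, candidate)
    matches = sum(b == c for b, c in zip(benchmark, candidate))
    return matches / len(benchmark)


def average_rank_variation(benchmark: Sequence[str], candidate: Sequence[str]) -> float:
    """Assumed definition: mean absolute difference in rank position per item."""
    _check_same_items(benchmark, candidate)
    position = {item: index for index, item in enumerate(candidate)}
    shifts = [abs(index - position[item]) for index, item in enumerate(benchmark)]
    return sum(shifts) / len(shifts)


# Hypothetical group rankings, best first: a human-judged benchmark and an automated metric.
benchmark_ranking = ["group_a", "group_b", "group_c", "group_d"]
metric_ranking = ["group_a", "group_c", "group_b", "group_d"]

cr = coincident_rate(benchmark_ranking, metric_ranking)
arv = average_rank_variation(benchmark_ranking, metric_ranking)
print(f"CR = {cr:.2f}, ARV = {arv:.2f}")
```

Under these assumed definitions, a metric that tracks the benchmark closely would have a CR near 1 and an ARV near 0.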