Is AI ground truth really true? The dangers of training and evaluating AI tools based on experts’ know-what

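Ground truth; Computer science; Quality (philosophy); Field (mathematics); Work (physics); Common ground; Need to know; Training (meteorology); Knowledge management; Psychology; Data science; Artificial intelligence; Engineering; Mathematics; Epistemology; Social psychology; Computer security; Pure mathematics; Meteorology; Philosophy; Physics; Mechanical engineering
Authors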
Sarah Lebovitz, Natalia Levina, Hila Lifshitz-Assaf
Source
Journal: University of Warwick - Warwick Research Archive Portal; Citations: 262
Identifier
DOI:10.25300/misq/2021/16564
Abstract

Organizational decision-makers need to evaluate AI tools in light of increasing claims that such tools outperform human experts. Yet, measuring the quality of knowledge work is challenging, raising the question of how to evaluate AI performance in such contexts. We investigate this question through a field study of a major U.S. hospital, observing how managers evaluated five different machine-learning (ML) based AI tools. Each tool reported high performance according to standard AI accuracy measures, which were based on ground truth labels provided by qualified experts. Trying these tools out in practice, however, revealed that none of them met expectations. Searching for explanations, managers began confronting the high uncertainty of experts’ know-what knowledge captured in ground truth labels used to train and validate ML models. In practice, experts address this uncertainty by drawing on rich know-how practices, which were not incorporated into these ML-based tools. Discovering the disconnect between AI’s know-what and experts’ know-how enabled managers to better understand the risks and benefits of each tool. This study shows dangers of treating ground truth labels used in ML models objectively when the underlying knowledge is uncertain. We outline implications of our study for developing, training, and evaluating AI for knowledge work.