Does AI help humans make better decisions? A statistical evaluation framework for experimental and observational studies

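Observational study, Computer science, Machine learning, Randomized experiment, Artificial intelligence, Outcome (game theory), Baseline (sea), Decision maker, Risk analysis (engineering), Operations research, Statistics, Medicine, Mathematics, Oceanography, Geology, Mathematical economics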
Authors
Eli Ben-Michael, D. James Greiner, Melody Huang, Kosuke Imai, Zhichao Jiang, Sooahn Shin
Source
Journal: Proceedings of the National Academy of Sciences of the United States of America [National Academy of Sciences]
Volume (issue): 122 (38)
Identifier
DOI: 10.1073/pnas.2505106122
Abstract

The use of AI, or more generally data-driven algorithms, has become ubiquitous in today’s society. Yet, in many cases and especially when stakes are high, humans still make final decisions. The critical question, therefore, is whether AI helps humans make better decisions compared to a human-alone or AI-alone system. We introduce a methodological framework to answer this question empirically with minimal assumptions. We measure a decision maker’s ability to make correct decisions using standard classification metrics based on the baseline potential outcome. We consider a single-blinded and unconfounded treatment assignment, in which the provision of AI-generated recommendations is assumed to be randomized across cases, conditional on observed covariates, with final decisions made by humans. Under this study design, we show how to compare the performance of three alternative decision-making systems—human-alone, human-with-AI, and AI-alone. Importantly, the AI-alone system encompasses any individualized treatment assignment, including those not used in the original study. We also show when AI recommendations should be provided to a human-decision maker, and when one should follow such recommendations. We apply the proposed methodology to our own randomized controlled trial evaluating a pretrial risk assessment instrument. We find that the risk assessment recommendations do not improve the classification accuracy of a judge’s decision to impose cash bail. Furthermore, replacing a human judge with algorithms—the risk assessment score and a large language model in particular—yields worse classification performance.
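
The comparison between the human-alone and human-with-AI systems can be illustrated with a small sketch. It is not the authors' estimator; it is a simplified illustration under assumptions that go beyond the abstract: AI provision `z` is completely randomized (no covariates), it affects the outcome only through the human decision `d`, and the observed outcome `y` equals the baseline potential outcome whenever `d == 0`. Under these assumptions the unknown share of negative cases is the same in both arms, so it cancels, and the difference in weighted misclassification risk depends only on cases with `d == 0`. The function name, the `fp_weight` parameter, and the toy data are all hypothetical.

```python
from typing import Sequence


def risk_difference(
    z: Sequence[int],
    d: Sequence[int],
    y: Sequence[int],
    fp_weight: float = 1.0,
) -> float:
    """Estimate risk(human-with-AI) minus risk(human-alone).

    z: 1 if the AI recommendation was shown for the case, else 0 (assumed randomized).
    d: the human's binary decision (1 = the more restrictive decision).
    y: observed binary outcome; assumed to equal the baseline potential
       outcome only when d == 0, so rows with d == 1 never use y.
    fp_weight: hypothetical cost of a false positive relative to a false negative.
    """

    def joint_rate(arm: int, outcome: int) -> float:
        # Share of cases in the given arm with d == 0 and y == outcome.
        n_arm = sum(1 for zi in z if zi == arm)
        hits = sum(
            1
            for zi, di, yi in zip(z, d, y)
            if zi == arm and di == 0 and yi == outcome
        )
        return hits / n_arm

    # Change in the false negative proportion: d == 0 but baseline outcome is 1.
    delta_fn = joint_rate(1, 1) - joint_rate(0, 1)
    # Change in the true negative proportion: d == 0 and baseline outcome is 0.
    # The false positive proportion moves by the opposite amount.
    delta_tn = joint_rate(1, 0) - joint_rate(0, 0)
    return delta_fn - fp_weight * delta_tn


# Toy data for illustration only: four cases without AI, four with AI.
z = [0, 0, 0, 0, 1, 1, 1, 1]
d = [0, 0, 1, 1, 0, 0, 0, 1]
y = [1, 0, 0, 0, 0, 0, 1, 0]
print(risk_difference(z, d, y))  # -0.25
```

A negative value means the human-with-AI arm has lower estimated risk than the human-alone arm in this toy data. A real analysis would also need to adjust for covariates and report uncertainty, which this sketch omits.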