Enhancing the reliability and accuracy of AI-enabled diagnosis via complementarity-driven deferral to clinicians (CoDoC)

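Workflow, Artificial intelligence, Computer science, Machine learning, False positive paradox, Certification, Workload, False positives and false negatives, Diagnostic accuracy, Medicine, Medical physics, Radiology, Political science, Database, Operating system, Law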
Authors
Krishnamurthy Dvijotham, Jim Winkens, Melih Barsbey, Sumedh Ghaisas, Nick Pawlowski, Robert Stanforth, Patricia MacWilliams, Zahra S. Ahmed, Shekoofeh Azizi, Yoram Bachrach, Laura Culp, Mayank Daswani, Jan Freyberg, Christopher Kelly, Atilla P. Kiraly, Scott McKinney, Basil Mustafa, Vivek Natarajan, Krzysztof J. Geras, Jan Witowski
Source
Journal: Research Square. Citations: 2
Identifier
DOI: 10.21203/rs.3.rs-2231672/v1
Abstract

Diagnostic AI systems trained using deep learning have been shown to achieve expert-level identification of diseases in multiple medical imaging settings [1,2]. However, such systems are not always reliable and can fail in cases diagnosed accurately by clinicians, and vice versa [3]. Mechanisms for leveraging this complementarity by learning to select optimally between discordant decisions of AIs and clinicians have remained largely unexplored in healthcare [4], yet have the potential to achieve levels of performance that exceed those possible from either AI or clinician alone [4]. We develop a Complementarity-driven Deferral-to-Clinical Workflow (CoDoC) system that can learn to decide when to rely on a diagnostic AI model and when to defer to a clinician or their workflow. We show that our system is compatible with diagnostic AI models from multiple manufacturers, obtaining enhanced accuracy (sensitivity and/or specificity) relative to clinician-only or AI-only baselines in clinical workflows that screen for breast cancer or tuberculosis (TB). For breast cancer, we demonstrate the first system that exceeds the accuracy of double-reading with arbitration (the "gold standard" of care) in a large representative UK screening program, with a 25% reduction in false positives despite equivalent true-positive detection, while achieving a 66% reduction in clinical workload. In two separate US datasets, CoDoC exceeds the accuracy of single-reading by board-certified radiologists and of two different standalone state-of-the-art AI systems, and this finding generalises across diagnostic AI manufacturers. For TB screening with chest X-rays, CoDoC improved specificity (while maintaining sensitivity) compared to standalone AI or clinicians for 3 of 5 commercially available diagnostic AI systems (5–15% reduction in false positives). Further, we show the limits of confidence-score-based deferral systems for medical AI by demonstrating that no deferral strategy could have achieved significant improvement on the remaining two diagnostic AI systems. Our comprehensive assessment demonstrates that the superiority of CoDoC is sustained in multiple realistic stress tests for generalisation of medical AI tools along four axes: variation in the medical imaging modality; variation in clinical settings and human experts; different clinical deferral pathways within a given modality; and different AI software. Further, given the simplicity of CoDoC, we believe that practitioners can easily adapt it, and we provide an open-source implementation to encourage widespread further research and application.
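
To make the idea of confidence-score-based deferral concrete, the following is a minimal illustrative sketch and not the authors' CoDoC implementation. It assumes a hypothetical simplification: the AI's decision is used when its confidence score falls outside a band [lo, hi], and the case is deferred to the clinician otherwise. The band is chosen on a tuning set to maximise specificity without dropping below the clinician's sensitivity. The function names (`combine`, `sens_spec`, `fit_band`), the grid, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations_with_replacement


def combine(scores, clinician, lo, hi):
    """Use the AI outside the band [lo, hi]; defer to the clinician inside it."""
    decisions = []
    for score, clinician_call in zip(scores, clinician):
        if score < lo:
            decisions.append(0)               # AI is confident: negative
        elif score > hi:
            decisions.append(1)               # AI is confident: positive
        else:
            decisions.append(clinician_call)  # uncertain: defer
    return decisions


def sens_spec(pred, truth):
    """Return (sensitivity, specificity) for binary predictions."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))
    pos = sum(truth)
    neg = len(truth) - pos
    return tp / pos, tn / neg


def fit_band(scores, clinician, truth, grid):
    """Grid-search (lo, hi) with lo <= hi; grid must be sorted ascending.

    Keeps the band with the highest specificity among those whose
    sensitivity is at least the clinician-only sensitivity.
    """
    base_sens, _ = sens_spec(clinician, truth)
    best = None
    for lo, hi in combinations_with_replacement(grid, 2):
        sens, spec = sens_spec(combine(scores, clinician, lo, hi), truth)
        if sens >= base_sens and (best is None or spec > best[0]):
            best = (spec, lo, hi)
    return best


# Toy tuning set (fabricated purely for illustration, not study data).
scores = [0.05, 0.10, 0.45, 0.55, 0.60, 0.90, 0.95, 0.40]
truth = [0, 0, 0, 1, 0, 1, 1, 1]
clinician = [0, 1, 0, 1, 0, 1, 0, 1]

result = fit_band(scores, clinician, truth, [0.0, 0.3, 0.5, 0.8, 1.0])
print(result)  # (1.0, 0.3, 0.8)
```

On this toy set the clinician alone and the AI alone (thresholded at 0.5) each reach 75% sensitivity and 75% specificity, while the combined rule with the band [0.3, 0.8] classifies every case correctly and defers only four of the eight cases. It is a small-scale picture of the complementarity the abstract describes, under the stated assumptions.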