Workflow
Process (computing)
Computer science
Fidelity
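Corporate governance
Clinical decision support system
Knowledge management
Data science
Management science
Artificial intelligence
Process management
Decision support system
Engineering
Business
Telecommunications
Finance
Database
Operating system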
Authors
Nikita Mehandru, Brenda Miao, Eduardo Rodriguez Almaraz, Madhumita Sushil, Atul J. Butte, Ahmed M. Alaa
Identifier
DOI:10.1038/s41746-024-01083-y
Abstract
Recent developments in large language models (LLMs) have unlocked opportunities for healthcare, from information synthesis to clinical decision support. These LLMs are not just capable of modeling language, but can also act as intelligent "agents" that interact with stakeholders in open-ended conversations and even influence clinical decision-making. Rather than being evaluated on benchmarks that measure a model's ability to process clinical data or answer standardized test questions, LLM agents can be modeled in high-fidelity simulations of clinical settings and should be assessed for their impact on clinical workflows. These evaluation frameworks, which we refer to as "Artificial Intelligence Structured Clinical Examinations" ("AI-SCE"), can draw from comparable technologies where machines operate with varying degrees of self-governance, such as self-driving cars, in dynamic environments with multiple stakeholders. Developing these robust, real-world clinical evaluations will be crucial to deploying LLM agents in medical settings.