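Overfitting
Data science
Computer science
Workflow
Health records
Robustness (evolution)
Sample (material)
Variable (mathematics)
Software deployment
Sample size determination
Data mining
Risk analysis (engineering)
Health care
Artificial intelligence
Medicine
Database
Statistics
Software engineering
Gene
Mathematical analysis
Economics
Biochemistry
Artificial neural network
Chemistry
Economic growth
Chromatography
Mathematics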
Authors
Christopher Martin Sauer, Li-Ching Chen, Stephanie L Hyland, Armand R.J. Girbes, Paul Elbers, Leo Anthony Celi
Identifier
DOI:10.1016/s2589-7500(22)00154-6
Abstract
Analysis of electronic health records (EHRs) is an increasingly common approach for studying real-world patient data. Use of routinely collected data offers several advantages compared with other study designs, including reduced administrative costs, the ability to update analysis as practice patterns evolve, and larger sample sizes. Methodologically, EHR analysis is subject to distinct challenges because data are not collected for research purposes. In this Viewpoint, we elaborate on the importance of in-depth knowledge of clinical workflows and describe six potential pitfalls to be avoided when working with EHR data, drawing on examples from the literature and our experience. We propose solutions for prevention or mitigation of factors associated with each of these six pitfalls: sample selection bias, imprecise variable definitions, limitations to deployment, variable measurement frequency, subjective treatment allocation, and model overfitting. Ultimately, we hope that this Viewpoint will guide researchers to further improve the methodological robustness of EHR analysis.