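Spurious relationship
Covariate
Computer science
Estimator
Context (archaeology)
Machine learning
Residual
Inference
Contrast (vision)
Confidence interval
Sample size determination
Data mining
Statistical inference
Artificial intelligence
Statistics
Algorithm
Mathematics
Biology
Paleontology
Authors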
Xiang Meng,Jonathan Huang
Abstract
Epidemiologists have access to various methods to reduce bias and improve statistical efficiency in effect estimation, from standard multivariable regression to state-of-the-art doubly robust efficient estimators paired with highly flexible, data-adaptive algorithms (“machine learning”). However, because each method carries numerous assumptions and trade-offs, epidemiologists face practical difficulties in recognizing which method, if any, is suitable for their specific data and hypotheses. Importantly, relative advantages are necessarily context-specific (data structure, algorithms, model misspecification), which limits the utility of universal guidance. Evaluating performance through real-data-based simulations is useful but out of reach for many epidemiologists. We present a user-friendly, offline Shiny app, REFINE2 (Realistic Evaluations of Finite sample INference using Efficient Estimators), that enables analysts to input their own data and quickly compare the performance of different algorithms within their data context in estimating a prespecified average treatment effect (ATE). REFINE2 automates plasmode simulation of a plausible target ATE given observed covariates and then examines bias and confidence interval coverage (relative to this target) given user-specified models. We present an extensive case study to illustrate how REFINE2 can be used to guide analyses within epidemiologists’ own data under three typical scenarios: residual confounding, spurious covariates, and misspecified effect modification. As expected, the apparent best method differed across scenarios, and the methods were suboptimal under residual confounding. REFINE2 may help epidemiologists not only choose among imperfect models, but also better understand common, underappreciated problems, such as finite-sample bias when using machine learning.
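To make the plasmode idea concrete, here is a minimal Python sketch of the general technique. It is not REFINE2's implementation (REFINE2 is a Shiny app, and its internals are not described in the abstract). The sketch assumes a linear outcome model fitted by ordinary least squares, a Wald-type 95% confidence interval, and a toy dataset; the names `ols`, `plasmode_evaluate`, and all parameters are hypothetical. It keeps the observed covariates and treatment, fixes a known target ATE, simulates new outcomes, and then reports bias and confidence interval coverage relative to that target.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: return coefficients and their standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return beta, se

def plasmode_evaluate(covariates, treatment, outcome, target_ate,
                      n_sims=200, seed=0):
    """Hypothetical helper: plasmode check of an OLS-adjusted ATE estimator.

    Assumption: the outcome-generating model is linear in treatment and
    covariates. Real analyses would swap in the models under comparison.
    """
    rng = np.random.default_rng(seed)
    n = len(outcome)
    design = np.column_stack([np.ones(n), treatment, covariates])

    # Step 1: fit an outcome model to the real data.
    beta, _ = ols(design, outcome)
    resid_sd = np.std(outcome - design @ beta, ddof=design.shape[1])

    # Step 2: overwrite the treatment coefficient with the known target ATE.
    beta_sim = beta.copy()
    beta_sim[1] = target_ate

    estimates, covered = [], []
    for _ in range(n_sims):
        # Step 3: resample observed rows, preserving the real covariate
        # and treatment structure, then simulate outcomes.
        idx = rng.integers(0, n, size=n)
        d = design[idx]
        y_sim = d @ beta_sim + rng.normal(0.0, resid_sd, size=n)

        # Step 4: apply the estimator under evaluation.
        b, se = ols(d, y_sim)
        lower, upper = b[1] - 1.96 * se[1], b[1] + 1.96 * se[1]
        estimates.append(b[1])
        covered.append(lower <= target_ate <= upper)

    return {"bias": float(np.mean(estimates) - target_ate),
            "coverage": float(np.mean(covered))}

# Toy data standing in for an analyst's own dataset (purely illustrative).
rng = np.random.default_rng(42)
n = 500
covs = rng.normal(size=(n, 2))
treat = (rng.random(n) < 1.0 / (1.0 + np.exp(-covs[:, 0]))).astype(float)
y = 1.0 + 0.5 * treat + covs @ np.array([0.8, -0.3]) + rng.normal(size=n)

result = plasmode_evaluate(covs, treat, y, target_ate=0.5)
print(result)
```

Because the estimator here matches the simulating model, bias should be near zero and coverage near the nominal 95%. The scenarios named in the abstract could be mimicked by, for example, dropping a true confounder from the estimator's design matrix (residual confounding) or adding irrelevant columns (spurious covariates).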