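Covariate
Causal inference
Estimator
Computer science
Inference
Population
Sample size determination
Variance (accounting)
Econometrics
Data mining
Machine learning
Artificial intelligence
Statistics
Mathematics
Demography
Accounting
Sociology
Business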
Authors
Peng Wu, Shanshan Luo, Zhi Geng
Identifier
DOI:10.1080/01621459.2024.2435656
Abstract
There is growing interest in exploring causal effects in target populations via data combination. However, most approaches are tailored to specific settings and lack comprehensive comparative analyses. In this article, we focus on a typical scenario involving a source dataset and a target dataset. We first design six settings under covariate shift and conduct a comparative analysis by deriving the semiparametric efficiency bounds for the average treatment effect (ATE) in the target population. We then extend this analysis to six new settings that incorporate both covariate shift and posterior drift. Our study uncovers the key factors that influence efficiency gains and the "effective sample size" when combining two datasets, with a particular emphasis on the roles of the variance ratio of potential outcomes between the two datasets and the derivatives of the posterior drift function. To the best of our knowledge, this is the first paper that explicitly explores the role of posterior drift functions in causal inference. We also propose novel methods for conducting sensitivity analysis to address violations of transportability between the two datasets. We empirically validate our findings by constructing locally efficient estimators and conducting extensive simulations. We demonstrate the proposed methods in two real-world applications.