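Estimator
Mathematics
Inference
Consistency (knowledge bases)
Outcome (game theory)
Nonparametric statistics
Missing data
Statistics
Estimation
Machine learning
Artificial intelligence
Econometrics
Computer science
Geometry
Management
Mathematical economics
Economics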
Authors
Yuqian Zhang, Jelena Bradić
Source
Journal: Biometrika
[Oxford University Press]
Date: 2021-09-09
Volume (issue): 109 (2): 387-403
Citations: 18
Identifier
DOI: 10.1093/biomet/asab042
Abstract
A fundamental challenge in semi-supervised learning lies in the observed data’s disproportional size when compared with the size of the data collected with missing outcomes. An implicit understanding is that the dataset with missing outcomes, being significantly larger, ought to improve estimation and inference. However, it is unclear to what extent this is correct. We illustrate one clear benefit: root-$n$ inference of the outcome’s mean is possible while only requiring a consistent estimation of the outcome, possibly at a rate slower than root-$n$. This is achieved by a novel $k$-fold, cross-fitted, double robust estimator. We discuss both linear and nonlinear outcomes. Such an estimator is particularly suited for models that naturally do not admit root-$n$ consistency, such as high-dimensional, nonparametric or semiparametric models. We apply our methods to estimating heterogeneous treatment effects.
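To make the idea of a $k$-fold, cross-fitted estimator of the outcome's mean concrete, here is a minimal Python sketch. It is not the authors' estimator: it assumes outcomes are missing completely at random, uses ordinary least squares as a stand-in outcome model, and the function name `cross_fitted_mean` and its parameters are hypothetical. For each fold, the outcome model is fitted on the other labeled folds, its predictions are averaged over the held-out labeled fold together with all unlabeled covariates, and the average held-out residual is added as a bias correction.

```python
import numpy as np

def cross_fitted_mean(X_lab, y_lab, X_unlab, k=5, seed=0):
    """Illustrative cross-fitted estimate of E[Y] from labeled and unlabeled data.

    Assumptions (not from the paper): outcomes are missing completely at
    random, and the outcome model is ordinary least squares with an intercept.
    """
    rng = np.random.default_rng(seed)
    n = len(y_lab)
    folds = np.array_split(rng.permutation(n), k)  # random split of labeled indices

    def design(A):
        # Prepend an intercept column to the covariate matrix.
        return np.column_stack([np.ones(len(A)), A])

    fold_estimates = []
    for j in range(k):
        held = folds[j]
        train = np.concatenate([folds[i] for i in range(k) if i != j])

        # Fit the outcome model on labeled data outside fold j only.
        beta, *_ = np.linalg.lstsq(design(X_lab[train]), y_lab[train], rcond=None)

        pred_held = design(X_lab[held]) @ beta
        pred_unlab = design(X_unlab) @ beta

        # Plug-in term: average prediction over held-out labeled and unlabeled points.
        plug_in = np.concatenate([pred_held, pred_unlab]).mean()
        # Correction term: average residual on the held-out labeled fold.
        correction = (y_lab[held] - pred_held).mean()
        fold_estimates.append(plug_in + correction)

    # Average the k fold-specific estimates.
    return float(np.mean(fold_estimates))
```

The residual correction is what lets the estimate tolerate an imperfect outcome model: errors in the fitted model enter the plug-in and correction terms with opposite signs, and cross-fitting keeps the fitted model independent of the fold it is evaluated on. Any other regression method could replace the least-squares step in this sketch.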