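Sieve (category theory)
Estimator
Dimension (graph theory)
Mathematics
Variance (accounting)
Mathematical optimization
Observational study
Computer science
Econometrics
Statistics
Economics
Discrete mathematics
Accounting
Pure mathematics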
Authors
Yue Fang, Xi Jin, Haitian Xie
Identifier
DOI: 10.1080/07350015.2024.2442672
Abstract
This study investigates the policy learning problem in observational studies, where the treatment variable can be multivalued and the propensity scores are unknown. We approximate the optimal policy in a global policy class with infinite complexity (VC/Natarajan) dimension, using a sequence of sieve policy classes with finite complexity dimension. The optimal policy within each sieve class is estimated by maximizing the empirical welfare, constructed through the doubly robust moment condition and cross-fitting method. To select the suitable sieve space, we maximize the penalized empirical welfare, with the penalty determined by either the Rademacher complexity or a holdout method. We establish oracle inequalities that demonstrate the bias and variance trade-off achieved by the data-driven policy estimator. We also investigate two specific sieve selections: (1) a monotone single index model and (2) a systematic discretization method, which uses conventional sieve results for smooth functions such as linear sieves and deep neural networks. In the empirical study, we apply our method to examine the policy of assigning individuals to job training of different lengths.
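The pipeline in the abstract (cross-fitted doubly robust scores, empirical welfare maximization within a policy class, data-driven choice among classes of growing richness) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the function names, the simulated data, the linear outcome regression, the constant propensity estimate, the threshold policies on a single covariate, and the plain holdout comparison, which stands in for the penalized selection described above. The grids of growing size are a crude stand-in for sieve classes.

```python
import numpy as np

def simulate(n, seed=0):
    """Hypothetical data: one covariate, three arms, randomly assigned treatment."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n)
    d = rng.integers(0, 3, n)
    centers = np.array([-0.6, 0.0, 0.6])  # arm k works best near centers[k]
    y = -(x - centers[d]) ** 2 + 0.5 * rng.standard_normal(n)
    return x, d, y

def dr_scores(x, d, y, n_arms, n_folds=2, seed=0):
    """Cross-fitted doubly robust scores gamma[i, k] for unit i and arm k."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % n_folds
    design = np.column_stack([np.ones(n), x])  # intercept plus covariate
    gamma = np.zeros((n, n_arms))
    for f in range(n_folds):
        train, test = folds != f, folds == f
        for k in range(n_arms):
            arm = train & (d == k)
            # Outcome regression: least squares on training units in arm k.
            beta, *_ = np.linalg.lstsq(design[arm], y[arm], rcond=None)
            mu = design[test] @ beta
            # Propensity: training-fold share of arm k (assumes it is constant in x).
            e = arm.sum() / train.sum()
            gamma[test, k] = mu + (d[test] == k) * (y[test] - mu) / e
    return gamma

def empirical_welfare(gamma, policy):
    """Average doubly robust score of the arm assigned to each unit."""
    return gamma[np.arange(len(policy)), policy].mean()

def threshold_policy(x, t1, t2):
    """Monotone rule: arm 0 below t1, arm 1 between t1 and t2, arm 2 above t2."""
    return (x > t1).astype(int) + (x > t2).astype(int)

def fit_threshold_policy(x, gamma, n_grid):
    """Maximize empirical welfare over ordered threshold pairs on a grid."""
    grid = np.linspace(-1.0, 1.0, n_grid)
    best_w, best_t = -np.inf, None
    for i, t1 in enumerate(grid):
        for t2 in grid[i:]:
            w = empirical_welfare(gamma, threshold_policy(x, t1, t2))
            if w > best_w:
                best_w, best_t = w, (t1, t2)
    return best_w, best_t

def select_by_holdout(x, gamma, grid_sizes, seed=0):
    """Fit each class on one part of the sample, keep the best on the held-out part."""
    rng = np.random.default_rng(seed)
    hold = rng.random(len(x)) < 0.3
    results = []
    for j in grid_sizes:
        _, (t1, t2) = fit_threshold_policy(x[~hold], gamma[~hold], j)
        w = empirical_welfare(gamma[hold], threshold_policy(x[hold], t1, t2))
        results.append((w, j, (t1, t2)))
    return max(results, key=lambda r: r[0])
```

Calling `simulate`, then `dr_scores`, then `select_by_holdout` with a few grid sizes returns the holdout welfare, the chosen grid size, and the fitted thresholds. In this toy setup the linear outcome regression is misspecified while the propensity estimate is correct, which is the kind of situation the doubly robust construction is meant to tolerate. The sketch computes the scores once on the full sample before the holdout split, a shortcut that a careful implementation would avoid.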