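Leverage (statistics)
Computer science
Artificial intelligence
Machine learning
Model selection
Oracle
Domain (mathematical analysis)
Covariate
Domain adaptation
Feature selection
Labeled data
Data mining
Mathematics
Classifier (UML)
Mathematical analysis
Software engineering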
Authors
Trent Kyono, Mihaela van der Schaar
Source
Journal: IEEE Transactions on Artificial Intelligence
[Institute of Electrical and Electronics Engineers]
Date: 2021-08-02
Volume (issue): 2 (6): 494-507
Citations: 10
Identifier
DOI:10.1109/tai.2021.3101185
Abstract
In many real-world settings, such as healthcare, machine learning models are trained and validated on one labeled domain and tested or deployed on another, where feature distributions differ, i.e., there is covariate shift. When annotations are costly or prohibitive, an unsupervised domain adaptation (UDA) regime can be leveraged requiring only unlabeled samples in the target domain. Existing UDA methods are unable to factor in a model's predictive loss based on predictions in the target domain and, therefore, suboptimally leverage density ratios of only the input covariates in each domain. In this article, we propose a model selection method for leveraging model predictions on a target domain without labels by exploiting the domain invariance of causal structure. We assume or learn a causal graph from the source domain and select models that produce predicted distributions in the target domain that have the highest likelihood of fitting our causal graph. We thoroughly analyze our method under oracle knowledge using synthetic data. We then show on several real-world datasets, including several COVID-19 examples, that our method is able to improve on the state-of-the-art UDA algorithms for model selection.
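The selection idea in the abstract can be illustrated with a small toy sketch. This is not the authors' algorithm: the three-variable graph X1 -> Y -> X2, the linear-Gaussian conditionals, the two candidate predictors, and all names (`sample`, `fit_edge`, `graph_score`) are assumptions made up for illustration. The sketch fits the graph's conditionals on labeled source data, then scores each candidate by how likely its predictions on unlabeled target data are under those conditionals.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, x1_mean):
    # Toy causal graph X1 -> Y -> X2 (an assumption for this sketch).
    # Only the distribution of X1 changes between domains (covariate shift).
    x1 = rng.normal(x1_mean, 1.0, n)
    y = 2.0 * x1 + rng.normal(0.0, 0.5, n)
    x2 = -1.0 * y + rng.normal(0.0, 0.5, n)
    return x1, y, x2

def fit_edge(parent, child):
    # Least-squares fit of child = a * parent + b, plus residual std.
    a, b = np.polyfit(parent, child, 1)
    sigma = np.std(child - (a * parent + b))
    return a, b, sigma

def gauss_loglik(child, parent, edge):
    # Pointwise Gaussian log-density of child given parent for one edge.
    a, b, sigma = edge
    resid = child - (a * parent + b)
    return -0.5 * np.log(2 * np.pi * sigma**2) - resid**2 / (2 * sigma**2)

def graph_score(x1, y_hat, x2, edges):
    # Mean log-likelihood of (x1, y_hat, x2) under the source-fitted
    # conditionals. The marginal of X1 is omitted because it may shift.
    ll = gauss_loglik(y_hat, x1, edges["x1->y"])
    ll = ll + gauss_loglik(x2, y_hat, edges["y->x2"])
    return float(np.mean(ll))

# Labeled source domain and unlabeled target domain (target labels discarded).
x1_s, y_s, x2_s = sample(2000, 0.0)
x1_t, _, x2_t = sample(2000, 3.0)

# Fit the conditionals of the assumed graph on the source domain.
edges = {"x1->y": fit_edge(x1_s, y_s), "y->x2": fit_edge(y_s, x2_s)}

# Two hypothetical candidates that behave similarly on most source points:
# a linear fit, and the same fit clipped to the central source label range.
a, b = np.polyfit(x1_s, y_s, 1)
lo, hi = np.percentile(y_s, [5, 95])
candidates = {
    "linear": lambda x1: a * x1 + b,
    "saturating": lambda x1: np.clip(a * x1 + b, lo, hi),
}

# Score each candidate using only unlabeled target data, then pick the best.
scores = {name: graph_score(x1_t, f(x1_t), x2_t, edges)
          for name, f in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy setup the saturating candidate clips most of its target predictions, so they no longer agree with the source-fitted conditionals and its score drops. The published method works with learned or assumed causal graphs over real datasets and is considerably more involved than this sketch.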