Inference
MNIST database
Computer science
Machine learning
Artificial intelligence
Set (abstract data type)
Heuristic
Scheme (mathematics)
Process (computing)
Training set
Data set
Data mining
Deep learning
Mathematics
Operating system
Mathematical analysis
Programming language
Authors
Zhaobo Lu, Hai Liang, Minghao Zhao, Qingzhe Lv, Tiancai Liang, Yilei Wang
Abstract
Machine unlearning is the process by which a deployed machine learning model is forced to forget some of its training data items. It normally produces two machine learning models, the original model and the unlearned model, representing the training results before and after the data items are deleted. However, recent studies find that machine unlearning is vulnerable to membership inference attacks: because training and nontraining data behave differently (i.e., data items in the training set receive high posterior probabilities), attackers can exploit this property to infer whether an item was used to train the original model. Nevertheless, such attacks fail in label-only settings, in which attackers cannot obtain the posteriors. In this paper, we propose a new label-only membership inference attack scheme targeted at machine unlearning that eliminates the dependence on posteriors. Our heuristic is that perturbations injected into candidate samples elicit different behaviors for training and nontraining data. Thus, in our scheme, the attacker iteratively queries the original/unlearned models and injects perturbations to change their predicted labels; it determines whether an item has been deleted by observing the perturbation magnitude. Extensive experiments (on the MNIST, CIFAR10, CIFAR100, and STL10 data sets) show that our method achieves high inference accuracy (measured by AUC) in label-only settings, for example, AUC = 0.96 on the MNIST data set. In addition, we analyze existing countermeasures for mitigating inference attacks and find that our scheme can bypass most of them.
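To make the perturbation-magnitude heuristic concrete, the following is a minimal Python sketch, not the paper's actual algorithm. It assumes label-only black-box access through a hypothetical oracle `query_label(x)` that returns only the predicted class, and the helper names (`min_flip_radius`, `infer_deleted`) and the decision `threshold` are illustrative assumptions.

```python
import numpy as np

def min_flip_radius(query_label, x, step=0.01, max_radius=2.0, trials=32, seed=0):
    """Estimate the smallest noise magnitude that changes the model's
    predicted label for x, using label-only queries (no posteriors).
    `query_label` is a hypothetical black-box oracle: x -> predicted label."""
    rng = np.random.default_rng(seed)
    base = query_label(x)
    radius = step
    while radius <= max_radius:
        for _ in range(trials):
            noise = rng.normal(size=x.shape)
            noise *= radius / (np.linalg.norm(noise) + 1e-12)  # scale noise to radius
            if query_label(x + noise) != base:
                return radius  # label flipped at this perturbation magnitude
        radius += step
    return max_radius  # label never flipped within the search budget

def infer_deleted(query_original, query_unlearned, x, threshold=0.05):
    """Guess whether x was deleted between the original and unlearned models.
    Intuition from the abstract: a sample the original model was trained on
    tends to sit farther from its decision boundary, so unlearning it should
    noticeably shrink the perturbation needed to flip its label.
    `threshold` is an assumed, tunable decision value."""
    r_orig = min_flip_radius(query_original, x)
    r_unl = min_flip_radius(query_unlearned, x)
    return (r_orig - r_unl) > threshold
```

In practice, the attacker would calibrate `threshold` (and the search granularity `step`/`trials`) on shadow models; this sketch only shows why comparing flip radii across the two models can substitute for the posteriors that label-only settings withhold.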