Keywords
Differential privacy
Computer science
Information privacy
Private information retrieval
MNIST database
Privacy by design
Machine learning
Process (computing)
Privacy protection
Information sensitivity
Privacy policy
Data mining
Artificial intelligence
Computer security
Internet privacy
Deep learning
Operating system
Authors
Franziska Boenisch, Christopher Mühl, Roy Rinberg, Jannis Ihrig, Adam Dziedzic
Source
Journal: Cornell University - arXiv
Date: 2022-01-01
Citations: 1
Identifier
DOI: 10.48550/arxiv.2202.10517
Abstract
Applying machine learning (ML) to sensitive domains requires privacy protection of the underlying training data through formal privacy frameworks, such as differential privacy (DP). Yet, usually, the privacy of the training data comes at the cost of the resulting ML models' utility. One reason for this is that DP uses one uniform privacy budget epsilon for all training data points, which has to align with the strictest privacy requirement encountered among all data holders. In practice, different data holders have different privacy requirements and data points of data holders with lower requirements can contribute more information to the training process of the ML models. To account for this need, we propose two novel methods based on the Private Aggregation of Teacher Ensembles (PATE) framework to support the training of ML models with individualized privacy guarantees. We formally describe the methods, provide a theoretical analysis of their privacy bounds, and experimentally evaluate their effect on the final model's utility using the MNIST, SVHN, and Adult income datasets. Our empirical results show that the individualized privacy methods yield ML models of higher accuracy than the non-individualized baseline. Thereby, we improve the privacy-utility trade-off in scenarios in which different data holders consent to contribute their sensitive data at different individual privacy levels.
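The methods described in the abstract build on the PATE framework, in which an ensemble of "teacher" models, each trained on a disjoint partition of the sensitive data, votes on labels for public queries, and noise added to the vote histogram provides the differential-privacy guarantee. The snippet below is a minimal sketch of the standard Laplace noisy-max aggregation step only, not of the paper's individualized-privacy variants; the function name, the per-query noise scale of 2/epsilon, and the example numbers are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def pate_noisy_argmax(teacher_votes, num_classes, epsilon, rng=None):
    """Aggregate teacher predictions for one query with PATE-style
    Laplace noisy-max (sketch; parameters are illustrative).

    teacher_votes : iterable of int class labels, one per teacher
    num_classes   : number of possible classes
    epsilon       : per-query privacy parameter; Laplace noise of
                    scale 2/epsilon is added to each vote count
    """
    rng = rng or np.random.default_rng()
    # Histogram of teacher votes over the classes.
    counts = np.bincount(np.asarray(teacher_votes), minlength=num_classes).astype(float)
    # Add independent Laplace noise to each count, then release the argmax.
    noisy_counts = counts + rng.laplace(loc=0.0, scale=2.0 / epsilon, size=num_classes)
    return int(np.argmax(noisy_counts))

# Hypothetical example: 250 teachers vote on one unlabeled public query (10 classes).
rng = np.random.default_rng(0)
votes = rng.integers(0, 10, size=250)
label = pate_noisy_argmax(votes, num_classes=10, epsilon=0.1, rng=rng)
print("noisy aggregated label:", label)
```

Under a uniform budget, every data holder's points are protected at the same epsilon per query; the paper's contribution is to let holders with looser requirements contribute more information to this aggregation, which the sketch above does not model.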