Incentives
Computer Science
Algorithms
Machine Learning
Artificial Intelligence
Data Mining
Data Science
Economics
Microeconomics
Authors
Zan Zhang, Guannan Liu, Junjie Wu, Yong Tan
Source
Journal: Social Science Research Network
[Social Science Electronic Publishing]
Date: 2022-01-01
Citations: 4
Abstract
Training effective machine learning algorithms requires heterogeneous data from different sources, which may violate data protection regulations. Federated learning (FedL), which has emerged as a privacy-aware alternative, enables collaborative model training without accessing the original data. One of the key challenges in FedL is how to incentivize clients with huge amounts of data to train algorithms on their local datasets. In this paper, we examine data pricing by designing incentive schemes under a FedL framework, which involves a platform that publishes a collaborative learning algorithm training task, and multiple data providers who are compensated for their data contributions. In particular, we analytically investigate how the interplay between moral hazard and the degree of complementarity across data varieties affects the platform's optimal choice of incentive schemes and the number of participants joining the algorithm training. Our analysis shows that it is optimal for the platform to offer data providers an individual output-based scheme when there are strong synergies between the varieties of data contributed by the providers. The marginal improvement-based scheme is optimal when there is strong substitutability across data varieties. Our results show that the performance-based scheme is dominated, yielding both lower profit and lower data contributions. By examining the optimal number of participants for the algorithm training, we find that when data varieties become more substitutable, the platform should reduce the group size under the individual output-based scheme but expand the group size under the marginal improvement-based scheme if the degree of complementarity across data varieties is low enough.
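The abstract does not specify the paper's analytical model, but the contrast it draws between complementary and substitutable data varieties can be illustrated with a toy sketch. Assuming (purely for illustration, not as the authors' model) that the value of the jointly trained algorithm follows a CES-style aggregator over providers' data contributions, the "marginal improvement" a provider is compensated for changes sharply with the degree of complementarity:

```python
def algorithm_value(contributions, rho):
    """Toy CES-style value of the jointly trained algorithm, 0 < rho <= 1.

    rho close to 1: data varieties are near-perfect substitutes;
    rho close to 0: strong complementarity (synergy) across varieties.
    Hypothetical illustration only -- not the paper's actual model.
    """
    return sum(x ** rho for x in contributions) ** (1.0 / rho)

def marginal_improvement(contributions, i, rho):
    """Value added by provider i: V(all providers) - V(all except i)."""
    rest = contributions[:i] + contributions[i + 1:]
    return algorithm_value(contributions, rho) - algorithm_value(rest, rho)

data = [1.0, 2.0, 3.0]  # hypothetical data contributions of three providers

# Near-substitutes: provider 0's marginal improvement stays close to its
# stand-alone contribution of 1.0.
print(marginal_improvement(data, 0, rho=0.9))

# Strong complementarity: the same provider's marginal improvement is far
# larger, because the data varieties reinforce one another.
print(marginal_improvement(data, 0, rho=0.3))
```

Under this sketch, paying each provider their marginal improvement becomes expensive when synergies are strong (each provider's removal destroys disproportionate value), which is consistent with the abstract's finding that the marginal improvement-based scheme is preferred only under strong substitutability.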