Computer science
Protocol (science)
Computer security
Computer network
Internet privacy
Medicine
Alternative medicine
Pathology
Authors
Zongda Han, Xiang Cheng, Wenhong Zhao, Jiaxin Fu, Zhaofeng He, Sen Su
Abstract
Extreme Gradient Boosting (XGBoost) demonstrates excellent performance in practice and is widely used in both industry and academic research. This extensive application has led to growing interest in employing multi-party data to develop more robust XGBoost models. In response to increasing concerns about privacy leakage, secure vertical federated XGBoost has been proposed. It employs secure multi-party computation techniques, such as secret sharing (SS), to allow multiple parties holding vertically partitioned data, i.e., disjoint features on the same samples, to collaborate in constructing an XGBoost model. However, running efficiency remains the primary obstacle to the practical application of existing protocols, especially in multi-party settings. These protocols not only require data-oblivious computations to protect intermediate results, which leads to high computational complexity, but also involve a large number of SS-based non-linear operations with high overhead, e.g., division operations in gain score calculation and comparison operations in best split selection. To this end, we present a secure and efficient multi-party protocol for vertical federated XGBoost, called SecureXGB, which performs the collaborative training of an XGBoost model in an SS-friendly manner. In SecureXGB, we first propose a parallelizable multi-party permutation method, which secretly and efficiently permutes all samples before model training to reduce the reliance on data-oblivious computations. Then, we design a linear gain score that can be evaluated without division operations and has equivalent utility to the original gain score. Finally, we develop a synchronous best split selection method to secretly identify the best split with the maximum gain score using a minimal number of comparison operations. Experimental results demonstrate that SecureXGB achieves better training efficiency than state-of-the-art protocols without loss of model accuracy.
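The abstract's distinction between cheap linear operations and costly non-linear ones under secret sharing can be illustrated with a minimal sketch of generic additive secret sharing. This is not the SecureXGB protocol itself: the ring size `MODULUS` and the helper names `share`, `reconstruct`, `add_shared`, and `scale_shared` are assumptions chosen for illustration only. Addition and multiplication by a public constant are done by each party on its own shares with no communication, whereas division and comparison have no such local form, which is why a division-free gain score and fewer comparisons matter.

```python
import secrets
from typing import List

# Ring size for the shares; an illustrative assumption, not taken from the paper.
MODULUS = 2 ** 64


def share(value: int, n_parties: int) -> List[int]:
    """Split an integer into n additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: List[int]) -> int:
    """Recombine shares; results in the upper half of the ring are read as negative."""
    total = sum(shares) % MODULUS
    return total - MODULUS if total >= MODULUS // 2 else total


def add_shared(a: List[int], b: List[int]) -> List[int]:
    """Add two shared values: party i adds its own two shares, no messages needed."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def scale_shared(a: List[int], k: int) -> List[int]:
    """Multiply a shared value by a public integer constant, again purely locally."""
    return [(x * k) % MODULUS for x in a]


if __name__ == "__main__":
    # Two hypothetical integer-encoded gradient values shared among three parties.
    g1 = share(125, 3)
    g2 = share(-40, 3)
    print(reconstruct(add_shared(g1, g2)))   # sum of the two secrets
    print(reconstruct(scale_shared(g1, 3)))  # first secret times a public constant
```

Any single share is a uniformly random ring element, so no individual party learns the underlying value; only the combination of all shares reveals it.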