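Homomorphic encryption
Computer science
Paillier cryptosystem
Random forest
Secure multi-party computation
Data mining
Encryption
Inference
Decision tree
Information privacy
Tree (set theory)
Private information retrieval
Cryptosystem
Theoretical computer science
Machine learning
Computer security
Cryptography
Artificial intelligence
Mathematics
Mathematical analysis
Hybrid cryptosystem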
Authors
Qianying Liao, Bruno Cabral, João Paulo Fernandes, Nuno Lourenço
Identifier
DOI:10.1109/ijcnn55064.2022.9892321
Abstract
Building a machine learning model requires large amounts of data. Due to privacy and regulatory concerns, these data may be owned by multiple sites and are often not mutually shareable. Our work deals with private learning and inference for the Weighted Random Forest model when data records are vertically distributed among multiple sites. Previous privacy-preserving vertical tree-based frameworks either adapt Secure Multi-party Computation or share intermediate results, and they are hard to generalize or scale. In contrast, our proposal contains efficient collaborative algorithms for calculating the Gini index and entropy, which measure the impurity of decision tree nodes, while protecting all intermediate values and disclosing minimal information. We offer a learning protocol based on the Paillier cryptosystem and digital envelopes, and an inference protocol based on a look-up table. Our experiments show that the proposed protocols cause no loss of predictive performance while still building and using the model within a reasonable time. The results imply that practitioners can overcome the barrier of data sharing and produce random forest models for data-heavy domains with strict privacy requirements, such as health prediction, fraud detection, and risk evaluation.
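For readers unfamiliar with the building blocks named in the abstract, the following is a minimal Python sketch of the two impurity measures (Gini index and entropy) and of the additive property of the Paillier cryptosystem. It is not the protocol from the paper: the function names, the tiny primes, and the idea of summing encrypted class counts are illustrative assumptions, and the key size is far too small for real use.

```python
import math
import secrets


def gini(counts):
    """Gini index of a node from its per-class record counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def entropy(counts):
    """Shannon entropy (base 2) of a node from its per-class record counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Toy Paillier with generator g = n + 1 (assumption: demo-sized primes only).
def keygen(p=293, q=433):
    n = p * q
    phi = (p - 1) * (q - 1)
    mu = pow(phi, -1, n)  # modular inverse of phi mod n (Python 3.8+)
    return n, (phi, mu)


def encrypt(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    phi, mu = priv
    u = pow(c, phi, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


n, priv = keygen()
# Hypothetical example: two encrypted partial counts are combined without
# decrypting either one; only the key holder can read the sum.
c_sum = add_encrypted(n, encrypt(n, 3), encrypt(n, 4))
total = decrypt(n, priv, c_sum)
```

The additive property is what lets a party aggregate encrypted values it cannot read; how the paper combines this with digital envelopes to protect intermediate impurity values is not specified in the abstract.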