Scalable Privacy-Preserving Distributed Learning

Authors

Abstract

In this paper, we address the problem of privacy-preserving distributed learning and evaluation of machine-learning models by analyzing it in the widespread MapReduce abstraction, which we extend with privacy constraints. We design spindle (Scalable Privacy-preservINg Distributed LEarning), the first system that covers the complete ML workflow by enabling the execution of a cooperative gradient descent and the evaluation of the obtained model, while preserving data and model confidentiality in a passive-adversary model with up to N-1 colluding parties. spindle uses multiparty homomorphic encryption to execute parallel high-depth computations on encrypted data without significant overhead. We instantiate spindle for the training and evaluation of generalized linear models on distributed datasets and show that it is able to accurately (on par with non-secure centrally-trained models) and efficiently (due to a multi-level parallelization of the computations) train models that require a high number of iterations on large input data with thousands of features, distributed among hundreds of data providers. For instance, it trains a logistic-regression model on a dataset with one million samples and 32 features distributed among 160 data providers in less than three minutes.
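The cooperative gradient descent described in the abstract can be sketched in plaintext as a MapReduce-style computation: each data provider maps its local partition to a gradient, and the reduce step aggregates the gradients into one model update. This is only an illustrative sketch (function and variable names are not from the paper); in spindle the aggregation and update run under multiparty homomorphic encryption.

```python
import numpy as np

def local_gradient(w, X, y):
    """MAP: each data provider computes the logistic-regression gradient
    on its own partition (X, y)."""
    preds = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (preds - y)

def cooperative_gd(partitions, dim, lr=0.5, iters=300):
    """REDUCE: aggregate the local gradients and update the shared model.
    Plaintext sketch of the MapReduce-style workflow; spindle performs the
    aggregation on encrypted values instead."""
    w = np.zeros(dim)
    n = sum(len(y) for _, y in partitions)
    for _ in range(iters):
        grad = sum(local_gradient(w, X, y) for X, y in partitions)
        w -= lr * grad / n
    return w
```

In the encrypted setting the reduce step operates on ciphertexts, so no party ever sees another party's gradient in the clear.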


Similar Sources

P4P: A Practical Framework for Privacy-Preserving Distributed Computation


Privacy-preserving distributed clustering

Clustering is a very important tool in data mining and is widely used in on-line services for medical, financial and social environments. The main goal in clustering is to create sets of similar objects in a data set. The data set to be used for clustering can be owned by a single entity, or in some cases, information from different databases is pooled to enrich the data so that the merged data...


Privacy-Preserving Bayesian Network Learning From Heterogeneous Distributed Data

In this paper, we propose a post randomization technique to learn a Bayesian network (BN) from distributed heterogeneous data, in a privacy sensitive fashion. In this case, two or more parties own sensitive data but want to learn a Bayesian network from the combined data. We consider both structure and parameter learning for the BN. The only required information from the data set is a set of su...
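A minimal way to picture the post-randomization idea above is randomized response on a binary attribute: each record's value is reported truthfully only with some probability, yet the aggregate sufficient statistic (here, a count) can still be estimated without bias. This is an illustrative mechanism under assumed names, not the paper's exact scheme.

```python
import random

def randomize(bit, keep_prob, rng):
    """Post-randomization: report the true bit with probability keep_prob,
    otherwise report its flip. Illustrative randomized-response mechanism."""
    return bit if rng.random() < keep_prob else 1 - bit

def estimate_count(reports, keep_prob):
    """Unbiased estimate of the true number of 1s from randomized reports.
    E[observed] = keep_prob*true + (1-keep_prob)*(n-true), solved for true."""
    n = len(reports)
    observed = sum(reports)
    return (observed - (1 - keep_prob) * n) / (2 * keep_prob - 1)
```

Individual reports reveal little about any single record, while the corrected count converges to the true statistic as the data grows.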


Fully Distributed Privacy Preserving Mini-batch Gradient Descent Learning

In fully distributed machine learning, privacy and security are important issues. These issues are often dealt with using secure multiparty computation (MPC). However, in our application domain, known MPC algorithms are not scalable or not robust enough. We propose a light-weight protocol to quickly and securely compute the sum of the inputs of a subset of participants assuming a semi-honest ad...
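The secure-sum primitive mentioned above can be sketched with additive secret sharing in a ring: each semi-honest party splits its input into random shares, distributes them, and only the combined partial sums reveal the total. This is a generic textbook sketch (names are illustrative), not the paper's light-weight protocol.

```python
import random

MODULUS = 2**61 - 1  # shares live in the ring Z_M

def make_shares(value, n_parties, rng):
    """Split a private input into n additive shares that sum to value mod M."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(inputs, rng=None):
    """Semi-honest secure sum: party i sends one share of its input to each
    party j; every party locally sums its received shares, and the partial
    sums combine to reveal only the total."""
    rng = rng or random.Random(0)
    n = len(inputs)
    received = [[] for _ in range(n)]  # received[j]: shares held by party j
    for x in inputs:
        for j, s in enumerate(make_shares(x, n, rng)):
            received[j].append(s)
    partial = [sum(col) % MODULUS for col in received]
    return sum(partial) % MODULUS
```

No single party sees another party's input: each holds only uniformly random shares whose sum is the private value.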


Privacy-Preserving Distributed Event Correlation

Janak J. Parekh. Event correlation is a widely-used data processing methodology, and is useful for the distributed monitoring of software faults and vulnerabilities. Most existing solutions have focused on “intra-organizational” correlation; organizations typically employ privacy policies that prohibit the exchange of information outside of the or...



Journal

Journal title: Proceedings on Privacy Enhancing Technologies

سال: 2021

ISSN: 2299-0984

DOI: https://doi.org/10.2478/popets-2021-0030