Protecting Output Privacy in Stream Mining
ثبت نشده
چکیده
Privacy preservation in data mining demands protecting both input and output privacy. The former refers to sanitizing the raw data itself before performing mining. The latter refers to preventing the mining output (model/pattern) from malicious pattern-based inference attacks. The preservation of input privacy does not necessarily lead to that of output privacy. This work studies the problem of protecting output privacy in the context of frequent pattern mining on data streams. After exposing the privacy breaches existing in current stream mining systems, we propose Butterfly, a light-weighted countermeasure that can effectively eliminate these breaches without explicitly detecting them, meanwhile minimizing the loss of the output accuracy. We further optimize the basic scheme by taking account of two types of semantic constraints, aiming at maximally preserving utility-related semantics while maintaining the hard privacy and accuracy guarantee. We conduct extensive experiments over real-life datasets to show the effectiveness and efficiency of our approach.
منابع مشابه
Technical Report: Output Privacy Protection in Stream Mining
Privacy preservation in data mining demands protecting both input and output privacy. The former refers to sanitizing the raw data itself before performing mining. The latter refers to preventing the mining output (model/pattern) from malicious pattern-based inference attacks. The preservation of input privacy does not necessarily lead to that of output privacy. This work studies the problem of...
متن کاملPrivacy-preserving Clustering of Data Streams
As most previous studies on privacy-preserving data mining placed specific importance on the security of massive amounts of data from a static database, consequently data undergoing privacy-preservation often leads to a decline in the accuracy of mining results. Furthermore, following by the rapid advancement of Internet and telecommunication technology, subsequently data types have transformed...
متن کاملCandidate Pruning-Based Differentially Private Frequent Itemsets Mining
Frequent Itemsets Mining(FIM) is a typical data mining task and has gained much attention. Due to the consideration of individual privacy, various studies have been focusing on privacy-preserving FIM problems. Differential privacy has emerged as a promising scheme for protecting individual privacy in data mining against adversaries with arbitrary background knowledge. In this paper, we present ...
متن کاملA Heuristic Approach to Preserve Privacy in Stream Data with Classification
Data stream Mining is new era in data mining field. Numerous algorithms are used to extract knowledge and classify stream data. Data stream mining gives birth to a problem threat of data privacy. Traditional algorithms are not appropriate for stream data due to large scale. To build classification model for large scale also required some time constraints which is not fulfilled by traditional al...
متن کاملGeometric Data Perturbation Techniques in Privacy Preserving On Data Stream Mining
Data mining is the information technology that extracts valuable knowledge from large amounts of data. Due to the emergence of data streams as a new type of data, data stream mining has recently become a very important and popular research issue. Privacy preservation issue of data streams mining is very important issue, in this dissertation work, an approach based on Geometric data perturbation...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007