Technical Report: Output Privacy Protection in Stream Mining
Abstract
Privacy preservation in data mining demands protecting both input and output privacy. The former refers to sanitizing the raw data itself before mining is performed. The latter refers to protecting the mining output (model/pattern) against malicious pattern-based inference attacks. Preserving input privacy does not necessarily preserve output privacy. This work studies the problem of protecting output privacy in the context of frequent pattern mining over data streams. After exposing the privacy breaches present in current stream mining systems, we propose Butterfly, a lightweight countermeasure that can effectively eliminate these breaches without explicitly detecting them, while minimizing the loss of output accuracy. We further optimize the basic scheme by taking into account two types of semantic constraints, aiming to maximally preserve utility-related semantics while maintaining the hard privacy and accuracy guarantees. We conduct extensive experiments over real-life datasets to show the effectiveness and efficiency of our approach.
Similar resources
Protecting Output Privacy in Stream Mining
Privacy preservation in data mining demands protecting both input and output privacy. The former refers to sanitizing the raw data itself before performing mining. The latter refers to preventing the mining output (model/pattern) from malicious pattern-based inference attacks. The preservation of input privacy does not necessarily lead to that of output privacy. This work studies the problem of...
Privacy-preserving Clustering of Data Streams
As most previous studies on privacy-preserving data mining placed specific importance on the security of massive amounts of data from a static database, consequently data undergoing privacy-preservation often leads to a decline in the accuracy of mining results. Furthermore, following by the rapid advancement of Internet and telecommunication technology, subsequently data types have transformed...
Output Privacy Protection With Pattern-Based Heuristic Algorithm
Privacy Preserving Data Mining (PPDM) is an ongoing research area aimed at bridging the gap between collaborative data mining and data confidentiality. Many different approaches have been adopted for PPDM; of these, the rule hiding approach is used in this article. This approach ensures output privacy, which protects the mined patterns (itemsets) from malicious inference problems. A...
A Heuristic Approach to Preserve Privacy in Stream Data with Classification
Data stream Mining is new era in data mining field. Numerous algorithms are used to extract knowledge and classify stream data. Data stream mining gives birth to a problem threat of data privacy. Traditional algorithms are not appropriate for stream data due to large scale. To build classification model for large scale also required some time constraints which is not fulfilled by traditional al...
Analysis of Email Fraud detection using WEKA Tool
Data mining is also useful for providing solutions for intrusion detection and auditing. While data mining has several applications in protection, there are also serious privacy fears. Because of email mining, even inexperienced users can connect data and make sensitive associations. Therefore we must protect the privacy of persons while working on practical data mining. Using K-mean cluste...
Publication date: 2007