Privacy Preservation through Random Non-linear Data Distortion
Abstract
Consider a scenario in which a data owner holds private or sensitive data and wants a data miner to study its "important" patterns without learning the sensitive information. Privacy-preserving data mining aims to solve this problem by randomly transforming (distorting) the data before it is released. Previous work considered only linear distortions (additive, multiplicative, or a combination of both) when studying the usefulness of the distorted output and the privacy preserved. In this paper, we consider a general class of potentially non-linear transformations of the data. We develop bounds on the expected accuracy of our non-linear distortion and quantify privacy using standard definitions. We show how our general transformation can be used in practice for two specific problem instances: a linear model and a popular non-linear model, namely a neural network. The paper presents a thorough theoretical analysis of the transformation and its possible applications. Experiments conducted on real-life datasets demonstrate the effectiveness of the approach.
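To make the distinction in the abstract concrete, the following is a minimal sketch of the three kinds of distortion it mentions. It is not the paper's transformation, which the abstract does not specify: the function names, the Gaussian noise, the random-projection form of the multiplicative case, and the choice of tanh as the non-linearity are all assumptions made for illustration.

```python
import numpy as np


def additive_distortion(x, rng, scale=0.1):
    """Linear, additive: add independent Gaussian noise to every value.

    The Gaussian noise and its scale are illustrative assumptions.
    """
    return x + rng.normal(0.0, scale, size=x.shape)


def multiplicative_distortion(x, rng, out_dim):
    """Linear, multiplicative: multiply the data by a random matrix.

    Uses a random projection as one common form of multiplicative
    distortion (an assumption, not taken from the abstract).
    """
    r = rng.normal(0.0, 1.0 / np.sqrt(out_dim), size=(x.shape[1], out_dim))
    return x @ r


def nonlinear_distortion(x, rng, out_dim):
    """Non-linear: pass a random linear map through a squashing function.

    tanh is a hypothetical choice of non-linearity for this sketch.
    """
    r = rng.normal(0.0, 1.0, size=(x.shape[1], out_dim))
    return np.tanh(x @ r)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(5, 3))  # 5 records, 3 attributes (synthetic)
    print(additive_distortion(data, rng).shape)
    print(multiplicative_distortion(data, rng, out_dim=4).shape)
    print(nonlinear_distortion(data, rng, out_dim=4).shape)
```

In this sketch the data owner would release only the distorted array and keep the random matrix and noise private. How much utility and privacy each variant retains is the question the paper analyzes, and the sketch does not attempt to reproduce those bounds.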
Similar resources
On the Privacy Preserving Properties of Random Data Perturbation Techniques
Privacy is becoming an increasingly important issue in many data mining applications. This has triggered the development of many privacy-preserving data mining techniques. A large fraction of them use randomized data distortion techniques to mask the data for preserving the privacy of sensitive data. This methodology attempts to hide the sensitive data by randomly modifying the data values ofte...
Hybrid Perturbation Technique using Feature Selection Method for Privacy Preservation in Data Mining
Privacy preservation in data mining refers to the area of data mining that seeks to safeguard sensitive information from unsolicited or unsanctioned disclosure, thereby protecting individual data records and their privacy. Data perturbation is a privacy preservation technique that adds noise to, or multiplies noise into, the original data. It performs anonymization based on the data type of s...
Homeland Defense, Privacy-Sensitive Data Mining, and Random Value Distortion
Data mining is playing an increasingly important role in sifting through large amounts of data for homeland defense applications. However, we must pay attention to privacy issues while mining the data. This has resulted in the development of several privacy-preserving data mining techniques. The random value distortion technique is one of them. It attempts to hide the sensitive data by ra...
Workshop on Data Mining for Counter Terrorism and Security
Data mining is playing an increasingly important role in sifting through large amounts of data for homeland defense applications. However, we must pay attention to privacy issues while mining the data. This has resulted in the development of several privacy-preserving data mining techniques. The random value distortion technique is one of them. It attempts to hide the sensitive data by ra...
Feature Selection: A Preprocess for Data Perturbation
Privacy preservation, a major concern in the design of data mining applications, has become a critical component that seeks a trade-off between mining performance and the protection of sensitive information. Data perturbation or distortion is a widely used approach for privacy protection. Many privacy preservation approaches have been developed, either by adding noise or by matrix decomposition method...
Publication date: 2008