Hide & Share: Landmark-Based Similarity for Private KNN Computation
Abstract
Computing k-nearest-neighbor (KNN) graphs is a fundamental operation in a variety of data-mining applications. As a prominent example, user-based collaborative filtering provides recommendations by identifying the items appreciated by the closest neighbors of a target user. As applications of this kind evolve, they will require KNN algorithms to operate on increasingly sensitive data. This has prompted researchers to propose decentralized peer-to-peer KNN solutions that avoid concentrating all information in the hands of one central organization. Unfortunately, such decentralized solutions remain vulnerable to malicious peers that attempt to collect and exploit information on participating users. In this paper, we seek to overcome this limitation by proposing H&S (Hide & Share), a novel landmark-based similarity mechanism for decentralized KNN computation. Landmarks allow users (and the associated peers) to estimate how close they lie to one another without disclosing their individual profiles. We evaluate H&S in the context of a user-based collaborative-filtering recommender with publicly available traces from existing recommendation systems. We show that although landmark-based similarity does disturb similarity values (to ensure privacy), the quality of the recommendations is not significantly hampered. We also show that disturbing similarity values turns out to be an asset, because it prevents a malicious user from performing a profile reconstruction attack against other users, thus reinforcing users' privacy. Finally, we provide a formal privacy guarantee by computing an upper bound on the amount of information revealed by H&S about a user's profile.
Keywords: Data privacy, Nearest neighbor searches, Peer-to-peer computing, Recommender systems
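The abstract does not spell out how landmarks are built or compared, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the paper: profiles are binary item vectors, landmarks are random profiles derived from a shared seed, each user's "coordinates" are its cosine similarities to the landmarks, and two users compare only those coordinates. The function names (`make_landmarks`, `landmark_coordinates`, `landmark_similarity`) are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def make_landmarks(num_landmarks, num_items, seed, density=0.5):
    """Assumption: landmarks are random binary profiles that two peers
    can regenerate independently from the same seed."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < density else 0 for _ in range(num_items)]
        for _ in range(num_landmarks)
    ]


def landmark_coordinates(profile, landmarks):
    """Describe a profile only by its similarity to each landmark."""
    return [cosine(profile, lm) for lm in landmarks]


def landmark_similarity(coords_a, coords_b):
    """Estimate user-to-user similarity from landmark coordinates alone;
    the raw profiles are never exchanged in this sketch."""
    return cosine(coords_a, coords_b)


if __name__ == "__main__":
    landmarks = make_landmarks(num_landmarks=5, num_items=8, seed=42)
    alice = [1, 1, 0, 0, 1, 0, 1, 0]
    bob = [1, 1, 0, 0, 1, 0, 0, 0]
    est = landmark_similarity(
        landmark_coordinates(alice, landmarks),
        landmark_coordinates(bob, landmarks),
    )
    print("exact cosine:     ", cosine(alice, bob))
    print("landmark estimate:", est)
```

The estimate generally differs from the exact cosine similarity, which illustrates the kind of disturbance the abstract refers to. This sketch omits the protocol details and the formal privacy analysis that the paper itself provides.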
Similar resources
Blind evaluation of location based queries using space transformation to preserve location privacy
In this paper we propose a fundamental approach to perform the class of Range and Nearest Neighbor (NN) queries, the core class of spatial queries used in location-based services, without revealing any location information about the query in order to preserve users’ private location information. The idea behind our approach is to utilize the power of one-way transformations to map the space of ...
Speeding up Memory-based Collaborative Filtering with Landmarks
Recommender systems play an important role in many scenarios where users are overwhelmed with too many choices to make. In this context, Collaborative Filtering (CF) arises by providing a simple and widely used approach for personalized recommendation. Memory-based CF algorithms mostly rely on similarities between pairs of users or items, which are posteriorly employed in classifiers like k-Nea...
Fault Detection Using the Clustering-kNN Rule for Gas Sensor Arrays
The k-nearest neighbour (kNN) rule, which naturally handles the possible non-linearity of data, is introduced to solve the fault detection problem of gas sensor arrays. In traditional fault detection methods based on the kNN rule, the detection process of each new test sample involves all samples in the entire training sample set. Therefore, these methods can be computation intensive in monitor...
Gorder: An Efficient Method for KNN Join Processing
An important but very expensive primitive operation of high-dimensional databases is the K-Nearest Neighbor (KNN) similarity join. The operation combines each point of one dataset with its KNNs in the other dataset and it provides more meaningful query results than the range similarity join. Such an operation is useful for data mining and similarity search. In this paper, we propose a novel KNN-...
Range-kNN queries with privacy protection in a mobile environment
With the help of location-based services (LBS), mobile users are able to access their actual locations, which can be used to search for information around them which they are interested in. One typical thing is that mobile users are more likely to protect their personal information such as their actual locations. In order to protect the privacy of users’ personal information, we proposed Range-...
Publication date: 2015