Optimal event sequence sanitization
نویسندگان
چکیده
Frequent event mining is a fundamental task to extract insight from an event sequence (long sequence of events that are associated with time points). However, it may expose sensitive events that leak confidential business knowledge or lead to intrusive inferences about groups of individuals. In this work, we aim to prevent this threat, by deleting occurrences of sensitive events, while preserving the utility of the event sequence. To quantify utility, we propose a model that captures changes, caused by deletion, to the probability distribution of events across the sequence. Based on the model, we define the problem of sanitizing an event sequence as an optimization problem. Solving the problem is important to preserve the output of many mining tasks, including frequent pattern mining and sequence segmentation. However, this is also challenging, due to the exponential number of ways to apply deletion to the sequence. To optimally solve the problem when there is one sensitive event, we develop an efficient algorithm based on dynamic programming. The algorithm also forms the basis of a simple, iterative method that optimally sanitizes an event sequence, when there are multiple sensitive events. Experiments on real and synthetic datasets show the effectiveness and efficiency of our method.
منابع مشابه
Optimal Sanitization Synthesis for Web Application Vulnerability Repair
We present a codeand input-sensitive sanitization synthesis approach for repairing string vulnerabilities that are common in web applications. The synthesized sanitization patch modifies the user input in an optimal way while guaranteeing that the repaired web application is not vulnerable. Given a web application, an input pattern and an attack pattern, we use automata-based static string anal...
متن کاملMICF: An effective sanitization algorithm for hiding sensitive patterns on data mining
Data mining mechanisms have widely been applied in various businesses and manufacturing companies across many industry sectors. Sharing data or sharing mined rules has become a trend among business partnerships, as it is perceived to be a mutually benefit way of increasing productivity for all parties involved. Nevertheless, this has also increased the risk of unexpected information leaks when ...
متن کاملA Novel Sanitization Approach for Privacy Preserving Utility Itemset Mining
Data mining plays a vital role in today’s information world wherein it has been widely applied in various business organizations. The current trend in business collaboration demands the need to share data or mined results to gain mutual benefit. However it has also raised a potential threat of revealing sensitive information when releasing data. Data sanitization is the process to conceal the s...
متن کاملRanking of Fire Stations with Fibonacci Sequence Technique, Case Study: District Ten of Tehran Municipality
One of the effective items to reduce time for arriving fire fighters to place of event is determining the optimal location of fire stations. Ranking can define the best location of a fire station through the available options. The case of this study is the district ten of Tehran municipality. That is the smallest district of Tehran municipality in terms of size and is highest in terms of densit...
متن کاملAn Empirical Analysis of XSS Sanitization in Web Application Frameworks
Filtering or sanitization is the predominant mechanism in today’s applications to defend against cross-site scripting (XSS) attacks. XSS sanitization can be difficult to get right as it ties in closely with the parsing behavior of the browser. This paper explains some of the subtleties of ensuring correct sanitization, as well as common pitfalls. We study several emerging web application framew...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015