Private Topic Modeling
نویسندگان
چکیده
We develop a privatised stochastic variational inference method for Latent Dirichlet Allocation (LDA). The iterative nature of stochastic variational inference presents challenges: multiple iterations are required to obtain accurate posterior distributions, yet each iteration increases the amount of noise that must be added to achieve a reasonable degree of privacy. We propose a practical algorithm that overcomes this challenge by combining: (1) A relaxed notion of the differential privacy, called concentrated differential privacy, which provides high probability bounds for cumulative privacy loss, which is well suited for iterative algorithms, rather than focusing on single-query loss; and (2) privacy amplification resulting from subsampling of large-scale data. Focusing on conjugate exponential family models, in our private variational inference, all the posterior distributions will be privatised by simply perturbing expected sufficient statistics. Using Wikipedia data, we illustrate the effectiveness of our algorithm for large-scale data.
منابع مشابه
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
متن کاملDistributed LDA-based Topic Modeling and Topic Agglomeration in a Latent Space
We describe the methodology that we followed to automatically extract topics corresponding to known events provided by the SNOW 2014 challenge in the context of the SocialSensor project. A data crawling tool and selected filtering terms were provided to all the teams. The crawled data was to be divided in 96 (15-minute) timeslots spanning a 24 hour period and participants were asked to produce ...
متن کاملTopic Modeling and Classification of Cyberspace Papers Using Text Mining
The global cyberspace networks provide individuals with platforms to can interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and many more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...
متن کاملPrivate cryptocurrency versus central bank digital money: Evolutionary game theory modeling of the distribution of Seigniorage Shares
When the monopoly of money creation is removed and private money can be exchanged between people, the issue of Seigniorage share will arise, which is currently conceivable with the advent of cryptocurrencies. The question of the present study is that if we are in a situation where private cryptocurrencies along with money are common in the society with the state publisher, what share of the Sei...
متن کاملPolicy-Based Automation of Dynamique and Multipoint Virtual Private Network Simulation on OPNET Modeler
The simulation of large-scale networks is a challenging task especially if the network to simulate is the Dynamic Multipoint Virtual Private Network, it requires expert knowledge to properly configure its component technologies. The study of these network architectures in a real environment is almost impossible because it requires a very large number of equipment, however, this task is feasible...
متن کاملModeling the operational risk in Iranian commercial banks: case study of a private bank
The Basel Committee on Banking Supervision from the Bank for International Settlement classifies banking risks into three main categories including credit risk, market risk, and operational risk. The focus of this study is on the operational risk measurement in Iranian banks. Therefore, issues arising when trying to implement operational risk models in Iran are discussed, and ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1609.04120 شماره
صفحات -
تاریخ انتشار 2016