Bayesian Nonparametric Collaborative Topic Poisson Factorization for Electronic Health Records-Based Phenotyping
Authors
Abstract
Phenotyping with electronic health records (EHR) has received much attention in recent years because it opens a new way to discover clinically meaningful insights, such as disease progression and disease subtypes, without human supervision. Despite these potential benefits, the complex nature of EHR data often requires more sophisticated methodologies than traditional methods provide. Previous work on EHR-based phenotyping applied unsupervised and supervised learning methods separately, detecting phenotypes and predicting medical risk scores independently. To improve EHR-based phenotyping by bridging these separate methods, we present Bayesian nonparametric collaborative topic Poisson factorization (BN-CTPF), which is the first nonparametric content-based Poisson factorization and the first application that jointly analyzes phenotype topics and estimates individual risk scores. BN-CTPF predicts risk scores better than previous matrix factorization and topic modeling methods, including Poisson factorization and its collaborative extensions. BN-CTPF also provides faceted views of the phenotype topics by patients’ demographics. Finally, we demonstrate a scalable stochastic variational inference algorithm by applying BN-CTPF to a national-scale EHR dataset.
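As a rough illustration of the base model family named in the abstract, the sketch below samples synthetic count data from a plain Poisson factorization with Gamma priors. It is not BN-CTPF itself: the number of topics is fixed rather than nonparametric, there is no content or risk-score component, and the function name, the patient-by-code layout, and the hyperparameter values are all assumptions made for illustration.

```python
import numpy as np

def sample_poisson_factorization(n_patients, n_codes, n_topics,
                                 shape=0.3, rate=1.0, seed=0):
    """Draw counts from a basic Gamma-Poisson factorization (hypothetical sketch).

    theta[p, k]: non-negative weight of topic k for patient p
    beta[c, k]:  non-negative weight of topic k for medical code c
    counts[p, c] ~ Poisson(sum_k theta[p, k] * beta[c, k])
    """
    rng = np.random.default_rng(seed)
    # NumPy parameterizes the Gamma by shape and scale, so scale = 1 / rate.
    theta = rng.gamma(shape, 1.0 / rate, size=(n_patients, n_topics))
    beta = rng.gamma(shape, 1.0 / rate, size=(n_codes, n_topics))
    # Poisson rate for each (patient, code) pair is an inner product of factors.
    rates = theta @ beta.T
    counts = rng.poisson(rates)
    return theta, beta, counts

theta, beta, counts = sample_poisson_factorization(n_patients=5, n_codes=8, n_topics=3)
print(counts.shape)
```

A small Gamma shape parameter (assumed here to be 0.3) tends to produce sparse factors, which is one reason this model family is often used for sparse count matrices.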
Similar resources
Bayesian Nonparametric Poisson Factorization for Recommendation Systems
We develop a Bayesian nonparametric Poisson factorization model for recommendation systems. Poisson factorization implicitly models each user’s limited budget of attention (or money) that allows consumption of only a small subset of the available items. In our Bayesian nonparametric variant, the number of latent components is theoretically unbounded and effectively estimated when computing a po...
Nonparametric Bayesian Matrix Factorization by Power-EP
Many real-world applications can be modeled by matrix factorization. By approximating an observed data matrix as the product of two latent matrices, matrix factorization can reveal hidden structures embedded in data. A common challenge in using matrix factorization is determining the dimensionality of the latent matrices from data. Indian Buffet Processes (IBPs) enable us to apply the nonparametr...
Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction
We present a probabilistic formulation of max-margin matrix factorization and build accordingly a nonparametric Bayesian model which automatically resolves the unknown number of latent factors. Our work demonstrates a successful example that integrates Bayesian nonparametrics and max-margin learning, which are conventionally two separate paradigms and enjoy complementary advantages. We develop ...
Gamma Processes, Stick-Breaking, and Variational Inference
While most Bayesian nonparametric models in machine learning have focused on the Dirichlet process, the beta process, or their variants, the gamma process has recently emerged as a useful nonparametric prior in its own right. Current inference schemes for models involving the gamma process are restricted to MCMC-based methods, which limits their scalability. In this paper, we present a variatio...
Bayesian change point estimation in Poisson-based control charts
Precise identification of the time when a process has changed enables process engineers to search for a potential special cause more effectively. In this paper, we develop change point estimation methods for a Poisson process in a Bayesian framework. We apply Bayesian hierarchical models to formulate the change point where there exists a step change, a linear trend and a known multip...
Journal title:
Volume, issue
Pages -
Publication date: 2016