Semi-parametric contextual bandits with graph-Laplacian regularization

Abstract

Non-stationarity is ubiquitous in human behavior, and addressing it in contextual bandits is challenging. Several works have addressed the problem by investigating semi-parametric contextual bandits and have warned that ignoring non-stationarity could harm performance. Another prevalent human behavior is social interaction, which has become available in the form of a social network or graph structure. As a result, graph-based contextual bandits have received much attention. In this paper, we propose SemiGraphTS, a novel contextual Thompson-sampling algorithm for a graph-based semi-parametric reward model. Our algorithm is the first to be proposed in this setting. We derive an upper bound on the cumulative regret that can be expressed as a multiple of a factor depending on the graph structure and the order for the semi-parametric model without a graph. We evaluate the proposed and existing algorithms via simulation and a real data example.

Similar resources

Learning on Graph with Laplacian Regularization

We consider a general form of transductive learning on graphs with Laplacian regularization, and derive margin-based generalization bounds using appropriate geometric properties of the graph. We use this analysis to obtain a better understanding of the role of normalization of the graph Laplacian matrix as well as the effect of dimension reduction. The results suggest a limitation of the standa...

Motion deblurring with graph Laplacian regularization

In this paper, we develop a regularization framework for image deblurring based on a new definition of the normalized graph Laplacian. We apply a fast scaling algorithm to the kernel similarity matrix to derive the symmetric, doubly stochastic filtering matrix from which the normalized Laplacian matrix is built. We use this new definition of the Laplacian to construct a cost function consisting...

Entropic Graph Regularization in Non-Parametric Semi-Supervised Classification

We prove certain theoretical properties of a graph-regularized transductive learning objective that is based on minimizing a Kullback-Leibler divergence based loss. These include showing that the iterative alternating minimization procedure used to minimize the objective converges to the correct solution and deriving a test for convergence. We also propose a graph node ordering algorithm that i...

Bayesian Regularization via Graph Laplacian

Regularization plays a critical role in modern statistical research, especially in high dimensional variable selection problems. Existing Bayesian methods usually assume independence between variables a priori. In this article, we propose a novel Bayesian approach, which explicitly models the dependence structure through a graph Laplacian matrix. We also generalize the graph Laplacian to allow ...

Linear Contextual Bandits with Knapsacks

We consider the linear contextual bandit problem with resource consumption, in addition to reward generation. In each round, the outcome of pulling an arm is a reward as well as a vector of resource consumptions. The expected values of these outcomes depend linearly on the context of that arm. The budget/capacity constraints require that the total consumption doesn’t exceed the budget for each ...

Journal

Journal title: Information Sciences

Year: 2023

ISSN: 0020-0255, 1872-6291

DOI: https://doi.org/10.1016/j.ins.2023.119367