A linear response bandit problem


Related articles

A Linear Programming Relaxation and a Heuristic for the Restless Bandit Problem with General Switching Costs

We extend a relaxation technique due to Bertsimas and Niño-Mora for the restless bandit problem to the case where arbitrary costs penalize switching between the bandits. We also construct a one-step lookahead policy using the solution of the relaxation. Computational experiments and a bound for approximate dynamic programming provide some empirical support for the heuristic.

A Lemma on the Multiarmed Bandit Problem

We prove a lemma on the optimal value function for the multiarmed bandit problem which provides a simple direct proof of optimality of writeoff policies. This, in turn, leads to a new proof of optimality of the index rule.

Fast Generalized Stochastic Linear Bandit

We study a generalized stochastic linear bandit problem and propose an algorithm that enjoys fast update. The computational complexity of the update is O(d), where d is the dimension of a context space. In comparison with other stochastic linear bandit algorithms, our algorithm does not need to incrementally update the inverse of a matrix so that it can avoid the O(d^2) computations. Yet,...

The Nonstochastic Multiarmed Bandit Problem

In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received much attention because of the simple model it provides of the trade-off between exploration (trying out each arm to find the best one) and exploitation (playing the arm believed to give the best payoff...

Journal

Journal title: Stochastic Systems

Year: 2013

ISSN: 1946-5238

DOI: 10.1214/11-ssy032