On the optimality of the Gittins index rule for multi-armed bandits with multiple plays

Similar resources

On the optimality of the Gittins index rule for multi-armed bandits with multiple plays

We investigate the general multi-armed bandit problem with multiple servers. We determine a condition on the reward processes sufficient to guarantee the optimality of the strategy that operates at each instant of time the projects with the highest Gittins indices. We call this strategy the Gittins index rule for multi-armed bandits with multiple plays, or briefly the Gittins index rule. We show...
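As a rough illustration of the selection step described in this abstract, the sketch below picks the projects with the highest index values at one instant. It assumes the index of every project has already been computed by some other means; the function name `select_arms`, the parameter `m` (number of servers), and the tie-breaking choice are hypothetical and not taken from the paper. It does not compute Gittins indices and does not check the paper's sufficient condition for optimality.

```python
import heapq
from typing import List, Sequence


def select_arms(indices: Sequence[float], m: int) -> List[int]:
    """Return the positions of the m projects with the highest index values.

    `indices` holds one precomputed index per project (assumed given).
    Ties are broken in favour of the lower position, which is an
    arbitrary choice made for this sketch.
    """
    if not 0 <= m <= len(indices):
        raise ValueError("m must be between 0 and the number of projects")
    # nlargest behaves like sorted(..., reverse=True)[:m] on the positions.
    top = heapq.nlargest(m, range(len(indices)), key=indices.__getitem__)
    return sorted(top)


if __name__ == "__main__":
    # Four projects, two servers: operate the two with the largest indices.
    print(select_arms([0.3, 0.9, 0.5, 0.7], 2))
```

In a full simulation this selection would be repeated at every time step, with the indices of the operated projects updated after each play.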

Budgeted Multi-Armed Bandits with Multiple Plays

We study the multi-play budgeted multi-armed bandit (MP-BMAB) problem, in which pulling an arm receives both a random reward and a random cost, and a player pulls L (≥ 1) arms at each round. The player targets at maximizing her total expected reward under a budget constraint B for the pulling costs. We present a multiple ratio confidence bound policy: At each round, we first calculate a truncated...

Multi-Armed Bandits, Gittins Index, and its Calculation

Multi-armed bandit is a colorful term that refers to the dilemma faced by a gambler playing in a casino with multiple slot machines (which were colloquially called one-armed bandits). What strategy should a gambler use to pick the machine to play next? It is the one for which the posterior mean of winning is the highest and thereby maximizes current expected reward, or the one for which the ...

Regret Analysis of the Finite-Horizon Gittins Index Strategy for Multi-Armed Bandits

I prove near-optimal frequentist regret guarantees for the finite-horizon Gittins index strategy for multi-armed bandits with Gaussian noise and prior. Along the way I derive finite-time bounds on the Gittins index that are asymptotically exact and may be of independent interest. I also discuss computational issues and present experimental results suggesting that a particular version of the Git...

Budget-Constrained Multi-Armed Bandits with Multiple Plays

We study the multi-armed bandit problem with multiple plays and a budget constraint for both the stochastic and the adversarial setting. At each round, exactly K out of N possible arms have to be played (with 1 ≤ K ≤ N). In addition to observing the individual rewards for each arm played, the player also learns a vector of costs which has to be covered with an a-priori defined budget B. The ga...

Journal

Journal title: Mathematical Methods of Operations Research (ZOR)

Year: 1999

ISSN: 1432-2994,1432-5217

DOI: 10.1007/s001860050080