An Experimental Design Approach for Regret Minimization in Logistic Bandits

Authors

Abstract

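In this work we consider the problem of regret minimization for logistic bandits. The main challenge of logistic bandits is reducing the dependence on a potentially large problem-dependent constant that can at worst scale exponentially with the norm of the unknown parameter vector. Previous works have applied the self-concordance of the logistic function to remove this worst-case dependence, providing regret guarantees that move the worst-case dependence to lower-order terms, with only polylogarithmic dependence in the main term as well as linear dependence on the dimension of the unknown parameter. This work improves upon the prior art by 1) removing all scaling with the constant in the main term and 2) reducing the dependence on the dimension to a square root in the fixed arm setting by employing an experimental design procedure. Our regret bound in fact takes a tighter instance (i.e., gap) dependent form for the first time in logistic bandits. We also propose a new warmup sampling algorithm that can dramatically reduce the lower-order term in the regret in general, and prove that it can reduce the lower-order term's dependency on the constant for some instances. Finally, we discuss the impact of the bias of the MLE on the logistic bandit problem in d dimensions, providing an example where d^2 lower-order regret (cf., d for linear bandits) may not be improved as long as the MLE is used, and how bias-corrected estimators may be used to make it closer to d.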
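For context, the problem-dependent constant discussed in the abstract is often written κ in the logistic bandit literature. The sketch below assumes one common form, κ = max over arms x of 1/μ'(xᵀθ*), where μ is the logistic function and θ* the unknown parameter. This definition, the toy arm set, and the function names are illustrative assumptions and are not taken from the abstract itself. The sketch only illustrates why such a constant can grow exponentially with the norm of θ*.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function mu(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z: float) -> float:
    """Derivative mu'(z) = mu(z) * (1 - mu(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def kappa(arms, theta):
    """Assumed definition: max over arms of 1 / mu'(x^T theta)."""
    return max(
        1.0 / sigmoid_derivative(sum(a * t for a, t in zip(x, theta)))
        for x in arms
    )

# Hypothetical fixed arm set in two dimensions.
arms = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
direction = (1.0, 0.0)

# Scale the parameter along a fixed direction and record kappa for each norm.
kappas = {
    scale: kappa(arms, tuple(scale * d for d in direction))
    for scale in (1.0, 3.0, 5.0)
}

for scale, value in kappas.items():
    print(f"norm = {scale:.0f}, kappa = {value:.2f}")
```

Under this assumed definition, 1/μ'(z) equals 2 + e^z + e^(-z), so the printed values grow roughly like e raised to the parameter norm.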

Similar resources

Gaussian Process Bandits: An Experimental Design Approach

We consider the online setting of optimizing an unknown function, sampled from a Gaussian Process, over a given bounded decision set so that our cumulative regret is low. Our analysis of an upper confidence algorithm provides sublinear regret bounds for popular classes of kernels by exploiting a surprising connection to optimal experimental design; in particular, our rates do not explicitl...

Near-Optimal Discrete Optimization for Experimental Design: A Regret Minimization Approach

The experimental design problem concerns the selection of k points from a potentially large design pool of p-dimensional vectors, so as to maximize the statistical efficiency regressed on the selected k design points. Statistical efficiency is measured by optimality criteria, including A(verage), D(eterminant), T(race), E(igen), V(ariance) and G-optimality. Except for the T-optimality, exact opt...

Regret of Queueing Bandits

We consider a variant of the multiarmed bandit problem where jobs queue for service, and service rates of different servers may be unknown. We study algorithms that minimize queue-regret: the (expected) difference between the queue-lengths obtained by the algorithm, and those obtained by a “genie”-aided matching algorithm that knows exact service rates. A naive view of this problem would sugges...

A Regret Minimization Approach in Product Portfolio Management with respect to Customers’ Price-sensitivity

In an uncertain and competitive environment, product portfolio management (PPM) becomes more challenging for manufacturers to decide what to make and establish the most beneficial product portfolio. In this paper, a novel approach in PPM is proposed in which the environment uncertainty, competitors’ behavior and customer’s satisfaction are simultaneously considered as the most important criteri...

Random Walk Approach to Regret Minimization

We propose a computationally efficient random walk on a convex body which rapidly mixes to a time-varying Gibbs distribution. In the setting of online convex optimization and repeated games, the algorithm yields low regret and presents a novel efficient method for implementing mixture forecasting strategies.

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i7.20741