Search results for: minimax regret

Number of results: 12162

2004
Lawrence V. Snyder, Mark S. Daskin

The two most widely considered measures for optimization under uncertainty are minimizing expected cost and minimizing worst-case cost or regret. In this paper, we present a novel robustness measure that combines the two objectives by minimizing the expected cost while bounding the relative regret in each scenario. In particular, the models seek the minimum-expected-cost solution that is p-robu...
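The p-robustness idea described in this abstract can be sketched in a few lines: keep only solutions whose relative regret in every scenario stays within a bound p, then pick the cheapest in expectation. This is a toy illustration under assumed data, not the paper's model; the solution names, costs, and probabilities are invented.

```python
# Hypothetical toy data: cost of each candidate solution under each scenario,
# and each scenario's probability. All numbers are illustrative only.
costs = {
    "A": [10.0, 14.0, 9.0],
    "B": [11.0, 11.5, 10.0],
    "C": [9.0, 18.0, 8.0],
}
probs = [0.5, 0.3, 0.2]
p = 0.25  # allowed relative regret in each scenario

n = len(probs)
# Optimal cost in each scenario (best solution if that scenario were known).
opt = [min(costs[x][s] for x in costs) for s in range(n)]

def is_p_robust(x):
    # Relative regret in scenario s is (cost - opt) / opt; bound it by p.
    return all((costs[x][s] - opt[s]) / opt[s] <= p for s in range(n))

def expected_cost(x):
    return sum(q * c for q, c in zip(probs, costs[x]))

feasible = [x for x in costs if is_p_robust(x)]
best = min(feasible, key=expected_cost)
print(best, round(expected_cost(best), 2))  # "C" is cut: its scenario-1 regret exceeds p
```

Here "A" and "B" satisfy the per-scenario regret bound, and "B" wins on expected cost; the actual paper embeds this constraint in facility-location optimization models rather than an enumeration.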

2009
Kevin Regan, Craig Boutilier

The specification of a Markov decision process (MDP) can be difficult. Reward function specification is especially problematic; in practice, it is often cognitively complex and time-consuming for users to precisely specify rewards. This work casts the problem of specifying rewards as one of preference elicitation and aims to minimize the degree of precision with which a reward function must be ...
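A minimal sketch of the minimax-regret decision criterion that underlies this kind of imprecise-reward work: given several reward functions consistent with a partial specification, choose the policy whose worst-case regret against an adversarial reward choice is smallest. The policies, rewards, and values below are hypothetical, and this enumeration is not the paper's elicitation procedure.

```python
# Rows: candidate policies; columns: value of that policy under each reward
# function consistent with the user's partial specification (all invented).
value = {
    "p1": [5.0, 2.0, 4.0],
    "p2": [4.5, 4.0, 3.5],
    "p3": [6.0, 1.0, 2.0],
}
n = 3
# Best achievable value under each candidate reward function.
best = [max(value[p][r] for p in value) for r in range(n)]

def max_regret(p):
    # Worst-case loss of policy p against an adversarial choice of reward.
    return max(best[r] - value[p][r] for r in range(n))

choice = min(value, key=max_regret)
print(choice, max_regret(choice))  # p2 minimizes the worst-case regret
```

Here "p3" is best under the first reward but risks a large regret under the second, while "p2" hedges across all three; elicitation then asks the questions that shrink this worst-case gap fastest.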

2015
Alexandra Carpentier, Michal Valko

We consider a stochastic bandit problem with infinitely many arms. In this setting, the learner cannot try every arm even once and must allocate its limited sampling budget to only a subset of the arms. All previous algorithms for this setting were designed for minimizing the cumulative regret of the learner. In this paper, we propose an algorithm aiming at minimizing th...
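The simple-regret objective this abstract contrasts with cumulative regret can be illustrated with a naive baseline (not the paper's algorithm): sample K arms from the infinite reservoir, explore each uniformly with the budget, recommend the empirically best one, and measure the gap between the best sampled mean and the recommended arm's true mean. All parameters and distributions are assumptions for the sketch.

```python
import random

random.seed(0)

def simple_regret(budget=1000, K=20):
    # Arm means drawn from the reservoir distribution (uniform on [0, 1] here).
    means = [random.random() for _ in range(K)]
    pulls = budget // K
    # Uniform exploration: average of `pulls` noisy rewards per arm.
    estimates = [
        sum(random.gauss(m, 1.0) for _ in range(pulls)) / pulls for m in means
    ]
    recommended = max(range(K), key=lambda i: estimates[i])
    # Simple regret: gap between the best sampled arm and the recommended one.
    return max(means) - means[recommended]

print(round(simple_regret(), 3))
```

The interesting question, which the paper addresses, is how K should scale with the budget and the reservoir distribution; a fixed K as above is only a baseline.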

Journal: CoRR, 2009
Jacob D. Abernethy, Alekh Agarwal, Peter L. Bartlett, Alexander Rakhlin

We study the regret of optimal strategies for online convex optimization games. Using von Neumann’s minimax theorem, we show that the optimal regret in this adversarial setting is closely related to the behavior of the empirical minimization algorithm in a stochastic process setting: it is equal to the maximum, over joint distributions of the adversary’s action sequence, of the difference betwe...

Journal: Journal of Machine Learning Research, 2010
Jean-Yves Audibert, Sébastien Bubeck

This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: pseudoregret, expected regret, high probability regret and tracking the best expert regret. We introduce a new forecaster, INF (Implicitly Normalized Forecaster) based on an arbitrary function ψ for which we propose a u...

2006
Trevor J. Sweeting, Gauri S. Datta, Malay Ghosh

We explore the construction of nonsubjective prior distributions in Bayesian statistics via a posterior predictive relative entropy regret criterion. We carry out a minimax analysis based on a derived asymptotic predictive loss function and show that this approach to prior construction has a number of attractive features. The approach here differs from previous work that uses either prior or po...

2006
Charles F. Manski

Consider the choice of a profiling policy where decisions to search for evidence of crime may vary with observable covariates of the persons at risk of search. I pose a planning problem whose objective is to minimise the social cost of crime and search. The consequences of a search rule depend on the extent to which search deters crime. I study the planning problem when the planner has partial ...

2009
Jean-Yves Audibert, Sébastien Bubeck

This work deals with four classical prediction games, namely full information, bandit and label efficient (full information or bandit) games as well as three different notions of regret: pseudo-regret, expected regret and tracking the best expert regret. We introduce a new forecaster, INF (Implicitly Normalized Forecaster) based on an arbitrary function ψ for which we propose a unified analysis...

[Chart: number of search results per year]