regret minimization

Search results for: regret minimization

Number of results: 37822

Strategy-Based Warm Starting for Regret Minimization in Games

2016

Noam Brown Tuomas Sandholm

Counterfactual Regret Minimization (CFR) is a popular iterative algorithm for approximating Nash equilibria in imperfect-information multi-step two-player zero-sum games. We introduce the first general, principled method for warm starting CFR. Our approach requires only a strategy for each player, and accomplishes the warm start at the cost of a single traversal of the game tree. The method pro...


Regret Minimization for Partially Observable Deep Reinforcement Learning

Journal: CoRR 2017

Peter H. Jin Sergey Levine Kurt Keutzer

Deep reinforcement learning algorithms that estimate state and state-action value functions have been shown to be effective in a variety of challenging domains, including learning control strategies from raw image pixels. However, algorithms that estimate state and state-action value functions typically assume a fully observed state and must compensate for partial or non-Markovian observations ...


Stochastic Regret Minimization for Revenue Management Problems with Nonstationary Demands

2016

Huanan Zhang Cong Shi Chao Qin Cheng Hua

Huanan Zhang∗, Cong Shi∗, Chao Qin†, Cheng Hua‡. ∗ Industrial and Operations Engineering, University of Michigan, Ann Arbor, MI 48109, {zhanghn, shicong}@umich.edu; † Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL 60208; ‡ Yale School of Management, Yale University, New Haven, CT 06511, cheng....


Approximation algorithms for regret minimization in vehicle routing problems

2014

Adrian Aloysius Bock

In this thesis, we present new approximation algorithms as well as hardness of approximation results for NP-hard vehicle routing problems related to public transportation. We consider two different problem classes that also occur frequently in areas such as logistics, robotics, or distribution systems. For the first problem class, the goal is to visit as many locations in a network as possible...


Efficient Nash equilibrium approximation through Monte Carlo counterfactual regret minimization

2012

Michael Johanson Nolan Bard Marc Lanctot Richard G. Gibson Michael H. Bowling

Recently, there has been considerable progress towards algorithms for approximating Nash equilibrium strategies in extensive games. One such algorithm, Counterfactual Regret Minimization (CFR), has proven to be effective in two-player zero-sum poker domains. While the basic algorithm is iterative and performs a full game traversal on each iteration, sampling based approaches are possible. For i...


Heteroscedastic Sequences: Beyond Gaussianity

2016

Oren Anava Shie Mannor

We address the problem of sequential prediction in the heteroscedastic setting, when both the signal and its variance are assumed to depend on explanatory variables. By applying regret minimization techniques, we devise an efficient online learning algorithm for the problem, without assuming that the error terms comply with a specific distribution. We show that our algorithm can be adjusted to ...


Online Learning for Time Series Prediction

2013

Oren Anava Elad Hazan Shie Mannor Ohad Shamir

In this paper we address the problem of predicting a time series using the ARMA (autoregressive moving average) model, under minimal assumptions on the noise terms. Using regret minimization techniques, we develop effective online learning algorithms for the prediction problem, without assuming that the noise terms are Gaussian, identically distributed or even independent. Furthermore, we show ...


A minimax and asymptotically optimal algorithm for stochastic bandits

2017

Pierre Ménard Aurélien Garivier

We propose the kl-UCB++ algorithm for regret minimization in stochastic bandit models with exponential families of distributions. We prove that it is simultaneously asymptotically optimal (in the sense of Lai and Robbins’ lower bound) and minimax optimal. This is the first algorithm proved to enjoy these two properties at the same time. This work thus merges two different lines of research with s...


Application of Random Regret Minimization Model in the Context of Intercity Travel Mode Choice

Journal: Journal of the Korean Society for Railway 2016


Park-and-ride lot choice model using random utility maximization and random regret minimization

Journal: Transportation 2017

