Search results for: regret minimization

Number of results: 37822

2017
Wataru Kumagai

The dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions. In this research, we address a dueling bandit problem based on a cost function over a continuous space. We propose a stochastic mirror descent algorithm and show that the algorithm achieves an O(√(T log T))-regret bound under strong ...

2005
Peter M. DeMarzo Ilan Kremer Yishay Mansour

We study the link between the game theoretic notion of 'regret minimization' and robust option pricing. We demonstrate how trading strategies that minimize regret also imply robust upper bounds for the prices of European call options. These bounds are based on 'no-arbitrage' and are robust in that they require only minimal assumptions regarding the stock price process. We then focus on the opti...

2015
Oskari Tammelin Neil Burch Michael Johanson Michael H. Bowling

Cepheus is the first computer program to essentially solve a game of imperfect information that is played competitively by humans. The game it plays is heads-up limit Texas hold’em poker, a game with over 10^14 information sets, and a challenge problem for artificial intelligence for over 10 years. Cepheus was trained using a new variant of Counterfactual Regret Minimization (CFR), called CFR+, usi...

2015
J. April Park W. Trey Hill Jennifer M. Bonds-Raacke

Decision making is a fundamental building block of people’s lives. Each decision requires expenditure of cognitive effort, though to a varying degree, which is considered a valuable yet limited resource in the decision making literature. Though the importance of a cognitive effort minimization goal is well-established in the marketing literature, this paper examined how cognitive effort exertio...

Journal: :PVLDB 2015
Taylor Kessler Faulkner Will Brackenbury Ashwin Lall

In exploring representative databases, a primary issue has been finding accurate models of user preferences. Given this, our work generalizes the method of regret minimization as proposed by Nanongkai et al. to include nonlinear utility functions. Regret minimization is an approach for selecting k representative points from a database such that every user’s ideal point in the entire database is...

2010
Marc J. V. Ponsen Marc Lanctot Steven de Jong

This paper presents a sample-based algorithm for the computation of restricted Nash strategies in complex extensive form games. Recent work indicates that regret-minimization algorithms using selective sampling, such as Monte-Carlo Counterfactual Regret Minimization (MCCFR), converge faster to Nash equilibrium (NE) strategies than their non-sampled counterparts which perform a full tree travers...

2014
Mehryar Mohri Andres Muñoz Medina

We study revenue optimization learning algorithms for posted-price auctions with strategic buyers. We analyze a very broad family of monotone regret minimization algorithms for this problem, which includes the previously best known algorithm, and show that no algorithm in that family admits a strategic regret more favorable than Ω(√T). We then introduce a new algorithm that achieves a strate...

Journal: :Journal of Machine Learning Research 2016
Joon Kwon Vianney Perchet

We demonstrate that, in the classical non-stochastic regret minimization problem with d decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, by considering the additional sparsity assumption (at each stage, at most s decisions incur a nonzero outcome), we derive optimal regret bounds of different orders. Specifically, with gains, we obtain a...

Journal: :Computers in Human Behavior 2015
Jisook Park W. Trey Hill Jennifer Bonds-Raacke

Decision making is a fundamental building block of people’s lives. Each decision requires expenditure of cognitive effort, though to a varying degree, which is considered a valuable yet limited resource in the decision making literature. Though the importance of a cognitive effort minimization goal is well-established in the marketing literature, this paper examined how cognitive effort exertio...

2013
Eric Jackson

Slumbot NL is a heads-up no-limit hold’em poker bot built with a distributed disk-based implementation of counterfactual regret minimization (CFR). Our implementation enables us to solve a large abstraction on commodity hardware in a cost-effective fashion. A variant of the Public Chance Sampling (PCS) version of CFR is employed which works particularly well with
