نتایج جستجو برای: regret minimization
تعداد نتایج: 37822 فیلتر نتایج به سال:
The dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions. In this research, we address a dueling bandit problem based on a cost function over a continuous space. We propose a stochastic mirror descent algorithm and show that the algorithm achieves an O( √ T log T )-regret bound under strong ...
We study the link between the game theoretic notion of ’regret minimization’ and robust option pricing. We demonstrate how trading strategies that minimize regret also imply robust upper bounds for the prices of European call options. These bounds are based on ’no-arbitrage’ and are robust in that they require only minimal assumptions regarding the stock price process. We then focus on the opti...
Cepheus is the first computer program to essentially solve a game of imperfect information that is played competitively by humans. The game it plays is heads-up limit Texas hold’em poker, a game with over 10 information sets, and a challenge problem for artificial intelligence for over 10 years. Cepheus was trained using a new variant of Counterfactual Regret Minimization (CFR), called CFR, usi...
Decision making is a fundamental building block of people’s lives. Each decision requires expenditure of cognitive effort, though to a varying degree, which is considered a valuable yet limited resource in the decision making literature. Though the importance of a cognitive effort minimization goal is well-established in the marketing literature, this paper examined how cognitive effort exertio...
In exploring representative databases, a primary issue has been finding accurate models of user preferences. Given this, our work generalizes the method of regret minimization as proposed by Nanongkai et al. to include nonlinear utility functions. Regret minimization is an approach for selecting k representative points from a database such that every user’s ideal point in the entire database is...
This paper presents a sample-based algorithm for the computation of restricted Nash strategies in complex extensive form games. Recent work indicates that regret-minimization algorithms using selective sampling, such as Monte-Carlo Counterfactual Regret Minimization (MCCFR), converge faster to Nash equilibrium (NE) strategies than their non-sampled counterparts which perform a full tree travers...
We study revenue optimization learning algorithms for posted-price auctions with strategic buyers. We analyze a very broad family of monotone regret minimization algorithms for this problem, which includes the previously best known algorithm, and show that no algorithm in that family admits a strategic regret more favorable than Ω( √ T ). We then introduce a new algorithm that achieves a strate...
We demonstrate that, in the classical non-stochastic regret minimization problem with d decisions, gains and losses to be respectively maximized or minimized are fundamentally different. Indeed, by considering the additional sparsity assumption (at each stage, at most s decisions incur a nonzero outcome), we derive optimal regret bounds of different orders. Specifically, with gains, we obtain a...
Decision making is a fundamental building block of people’s lives. Each decision requires expenditure of cognitive effort, though to a varying degree, which is considered a valuable yet limited resource in the decision making literature. Though the importance of a cognitive effort minimization goal is well-established in the marketing literature, this paper examined how cognitive effort exertio...
Slumbot NL is a heads-up no-limit hold’em poker bot built with a distributed disk-based implementation of counterfactual regret minimization (CFR). Our implementation enables us to solve a large abstraction on commodity hardware in a cost-effective fashion. A variant of the Public Chance Sampling (PCS) version of CFR is employed which works particularly well with
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید