Search results for: q learning

Number of results: 717428

2005
Dorian Suc, Ivan Bratko

Usual numerical learning methods are primarily concerned with finding a good numerical fit to the data and often make predictions that do not correspond to qualitative laws in the modelled domain or to expert intuition. In contrast, the idea of Q learning is to induce qualitative constraints from the training data and to use these constraints to guide numerical regression. The resulting numerical predicti...
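As a toy illustration of the general idea of letting a qualitative constraint guide a numerical fit, the sketch below assumes (rather than induces) a monotonicity constraint and contrasts an unconstrained polynomial fit with an isotonic fit that respects it. This is only an illustration of the principle, not the authors' algorithm.

```python
# Toy illustration: a qualitative (monotonicity) constraint guiding a numerical
# fit.  The constraint is assumed here rather than induced from data, so this
# is only a sketch of the general idea, not the method of the paper.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 60)
y = 2.0 * x + rng.normal(scale=4.0, size=x.shape)     # noisy but monotone data

# Unconstrained flexible fit: a degree-7 polynomial can wiggle and locally
# contradict the qualitative law "y increases with x".
poly = np.poly1d(np.polyfit(x, y, deg=7))
poly_monotone = bool(np.all(np.diff(poly(x)) >= 0.0))

# Fit guided by the qualitative constraint: isotonic regression enforces a
# non-decreasing prediction by construction.
iso_pred = IsotonicRegression(increasing=True).fit_transform(x, y)
iso_monotone = bool(np.all(np.diff(iso_pred) >= -1e-12))

print("polynomial fit monotone:", poly_monotone)   # often False on noisy data
print("isotonic fit monotone:  ", iso_monotone)    # always True by construction
```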

2011
Nataliya Sokolovska, Olivier Teytaud, Mario Milone

Discretization of the state and action spaces is a critical issue in Q-Learning. In our contribution, we propose a real-time adaptation of the discretization via the progressive widening technique, which has already been used in bandit-based methods. The results consistently converge to the optimum of the problem without re-tuning the parametrization for each new problem.
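A minimal tabular sketch of the general progressive-widening idea follows: the number of discrete actions allowed in a state grows with that state's visit count. The widening rule k(s) = ceil(C * n(s)^alpha), the constants, and the fixed action grid are illustrative assumptions, not the paper's construction.

```python
# Sketch of progressive widening applied to the action discretization of
# tabular Q-learning.  The widening rule, constants, and fixed action grid are
# illustrative assumptions (real variants often sample new continuous actions).
import math
import random
from collections import defaultdict

ALPHA_PW, C_PW = 0.5, 1.0          # widening exponent / constant (assumed)
GAMMA, LR, EPS = 0.95, 0.1, 0.1    # usual Q-learning parameters

# A fixed fine grid over a 1-D continuous action range; widening only decides
# how many of these actions are currently *allowed* in a given state.
ACTION_GRID = [i / 20.0 for i in range(21)]          # actions in [0, 1]

Q = defaultdict(float)             # Q[(state, action_index)]
visits = defaultdict(int)          # n(state)

def allowed_actions(state):
    """Indices of grid actions unlocked so far by progressive widening."""
    k = max(1, math.ceil(C_PW * visits[state] ** ALPHA_PW))
    return list(range(min(k, len(ACTION_GRID))))

def select_action(state):
    """Epsilon-greedy choice restricted to the currently allowed actions."""
    acts = allowed_actions(state)
    if random.random() < EPS:
        return random.choice(acts)
    return max(acts, key=lambda a: Q[(state, a)])

def update(state, a_idx, reward, next_state):
    """Standard Q-learning update; the max also ranges over allowed actions."""
    visits[state] += 1
    best_next = max(Q[(next_state, a)] for a in allowed_actions(next_state))
    Q[(state, a_idx)] += LR * (reward + GAMMA * best_next - Q[(state, a_idx)])
```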

Journal: J. Artificial Societies and Social Simulation, 2008
Keiki Takadama, Tetsuro Kawai, Yuhsuke Koyama

This paper addresses both micro- and macro-level validation in agent-based simulation (ABS) to explore validated agents that can reproduce not only human-like behaviors externally but also human-like thinking internally. For this purpose, we employ the sequential bargaining game, which can investigate changes in humans’ behaviors and thinking over a longer horizon than the ultimatum game (i.e., one-time bargaini...

1994
Robert H. Crites, Andrew G. Barto

We prove the convergence of an actor/critic algorithm that is equivalent to Q-learning by construction. Its equivalence is achieved by encoding Q-values within the policy and value function of the actor and critic. The resultant actor/critic algorithm is novel in two ways: it updates the critic only when the most probable action is executed from any given state, and it rewards the actor using c...
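For background only, here is a generic tabular actor-critic sketch (softmax actor, TD(0) critic). It is a plain textbook-style update, not the specific Q-value-encoding construction that the paper proves equivalent to Q-learning; the state/action sizes and step sizes are assumptions.

```python
# Generic tabular actor-critic sketch (softmax actor, TD(0) critic), given as
# background only; NOT the paper's Q-value-encoding construction.
import numpy as np

n_states, n_actions = 5, 3
theta = np.zeros((n_states, n_actions))   # actor preferences
V = np.zeros(n_states)                    # critic state-value estimates
GAMMA, LR_V, LR_PI = 0.95, 0.1, 0.05

def policy(s):
    """Softmax over the actor's preferences for state s."""
    prefs = theta[s] - theta[s].max()
    p = np.exp(prefs)
    return p / p.sum()

def step_update(s, a, r, s_next):
    # Critic: TD(0) update of the state-value estimate.
    td_error = r + GAMMA * V[s_next] - V[s]
    V[s] += LR_V * td_error
    # Actor: move preferences in the direction of the TD error
    # (softmax policy-gradient style update).
    grad = -policy(s)
    grad[a] += 1.0
    theta[s] += LR_PI * td_error * grad
```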

2017
P. A. van der Linden

This thesis studies the extent to which the A3C architecture is able to learn higher-order control tasks. The tasks consisted of a set of subtasks of the simplified Space Fortress game, which vary in complexity and in the order of control required. Previous experiments applying Deep Q-learning to these subtasks have shown substantial learning in the lower-order tasks,...

Journal: IEEE Control Systems Letters, 2020

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence, 2021

The Q-learning algorithm is known to be affected by the maximization bias, i.e. the systematic overestimation of action values, an important issue that has recently received renewed attention. Double Q-learning has been proposed as an efficient way to mitigate this bias. However, it comes at the price of underestimation, in addition to increased memory requirements and slower convergence. In this paper, we introduce a new way to address the bias fo...
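For context on the bias being discussed, below is a minimal tabular Double Q-learning update, i.e. the existing baseline the abstract refers to, not the new method proposed in the paper. The action set and constants are illustrative.

```python
# Minimal tabular Double Q-learning update (the baseline referred to in the
# abstract).  Two tables are kept: one selects the argmax action, the other
# evaluates it, which reduces the maximization bias of single-table Q-learning.
import random
from collections import defaultdict

GAMMA, LR = 0.99, 0.1
ACTIONS = [0, 1, 2]
QA = defaultdict(float)
QB = defaultdict(float)

def double_q_update(s, a, r, s_next):
    if random.random() < 0.5:
        # Update A: select the action with A, evaluate it with B.
        a_star = max(ACTIONS, key=lambda x: QA[(s_next, x)])
        target = r + GAMMA * QB[(s_next, a_star)]
        QA[(s, a)] += LR * (target - QA[(s, a)])
    else:
        # Update B: select the action with B, evaluate it with A.
        b_star = max(ACTIONS, key=lambda x: QB[(s_next, x)])
        target = r + GAMMA * QA[(s_next, b_star)]
        QB[(s, a)] += LR * (target - QB[(s, a)])
```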

1999
Ferenc Beleznay

We compare scaling properties of several value-function estimation algorithms. In particular, we prove that Q-learning can scale exponentially slowly with the number of states. We identify the reasons for the slow convergence and show that both TD(λ) and Q-learning with a fixed learning rate enjoy rather fast convergence, just like the model-based method.
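For reference, the tabular Q-learning update being compared is sketched below in two variants: a decaying 1/n(s,a) learning-rate schedule and a fixed learning rate. The two-action setup, the discount factor, and the constants are illustrative assumptions.

```python
# The tabular Q-learning update referred to in the abstract, in two variants:
# a decaying 1/n(s,a) learning-rate schedule and a fixed learning rate ALPHA.
# The action set, discount, and constants are illustrative assumptions.
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.95
ACTIONS = [0, 1]
Q = defaultdict(float)
n = defaultdict(int)               # per (state, action) update counts

def q_update(s, a, r, s_next, fixed_rate=True):
    n[(s, a)] += 1
    lr = ALPHA if fixed_rate else 1.0 / n[(s, a)]   # fixed vs decaying schedule
    best_next = max(Q[(s_next, x)] for x in ACTIONS)
    Q[(s, a)] += lr * (r + GAMMA * best_next - Q[(s, a)])
```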

2014
Jason Cullimore, Howard Hamilton, David Gerhard

One challenge relating to the creation of adaptive music involves generating transitions between musical ideas. This paper proposes a solution to this problem based on a modification of the Q-Learning framework described by Reese, Yampolskiy and Elmaghraby. The proposed solution represents chords as states in a domain and generates a transition between any two major or minor chords by finding a...
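As an illustration of treating chords as Q-learning states, the sketch below learns a transition path between two chords under a hypothetical reward that favours shared pitch classes and reaching a goal chord. The chord vocabulary, reward function, and episode structure are assumptions for illustration, not the design described in the paper.

```python
# Illustrative sketch: chords as Q-learning states, with a *hypothetical*
# reward favouring transitions that share pitch classes.  Chord vocabulary,
# reward, and episode structure are assumptions, not the paper's design.
import random
from collections import defaultdict

CHORDS = {
    "C": {0, 4, 7}, "G": {7, 11, 2}, "Am": {9, 0, 4}, "F": {5, 9, 0},
}
NAMES = list(CHORDS)
GAMMA, LR, EPS = 0.9, 0.2, 0.1
Q = defaultdict(float)                      # Q[(current_chord, next_chord)]

def reward(c_from, c_to, goal):
    shared = len(CHORDS[c_from] & CHORDS[c_to])   # voice-leading smoothness proxy
    return shared + (10.0 if c_to == goal else 0.0)

def train(start="C", goal="F", episodes=500, max_steps=8):
    for _ in range(episodes):
        chord = start
        for _ in range(max_steps):
            if random.random() < EPS:
                nxt = random.choice(NAMES)
            else:
                nxt = max(NAMES, key=lambda c: Q[(chord, c)])
            r = reward(chord, nxt, goal)
            best = max(Q[(nxt, c)] for c in NAMES)
            Q[(chord, nxt)] += LR * (r + GAMMA * best - Q[(chord, nxt)])
            chord = nxt
            if chord == goal:
                break

train()
print(max(NAMES, key=lambda c: Q[("C", c)]))   # learned first step of a C -> F path
```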

[Chart: number of search results per year]