Search results for: atari
Number of results: 829
Some of the cultural and technical forces that influenced the creation of the “man” (the player-controlled element) in two early home video games, Pitfall! and Yars’ Revenge, are discussed. We find that the specific nature of the Atari Video Computer System (also known as the Atari VCS and Atari 2600) as a computing platform enables and constrains what can be done on the system, and that it als...
We experimented with a simple yet powerful optimization for Monte-Carlo Go tree search. It consists of dealing appropriately with strings that have two liberties. The heuristic fits in one page of code, and the Go program that uses it improves from winning 50% of its games against Gnugo 3.6 to winning 76%.
Teaching computers to play video games is a complex learning problem that has recently seen increased attention. In this paper, we develop a system that, using constant model and hyperparameter settings, learns to play a variety of Atari games. In order to accomplish this task, we extract object features from the game screen, and provide these features as input into reinforcement learning algor...
Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update. We investigate the use of eligibility traces in combination with recurrent networks in the Atari domain. We illustrate the benefits of both recurrent nets and eligibility traces in some Atari games, and highligh...
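To make the phrase "propagating knowledge back over time-steps in a single update" concrete, here is a minimal sketch of tabular TD(λ) with accumulating eligibility traces. It is a textbook-style illustration, not the recurrent-network method studied in the paper; the function name `td_lambda_episode`, the episode format, and the default hyperparameters are assumptions chosen for this example.

```python
def td_lambda_episode(values, episode, alpha=0.1, gamma=0.99, lam=0.9):
    """Run tabular TD(lambda) with accumulating traces over one episode.

    values  : list of state-value estimates, indexed by state id (updated in place)
    episode : list of (state, reward, next_state, done) transitions
    All names and defaults here are illustrative assumptions.
    """
    traces = [0.0] * len(values)
    for state, reward, next_state, done in episode:
        # One-step TD error; a terminal next state contributes no bootstrap value.
        bootstrap = 0.0 if done else gamma * values[next_state]
        td_error = reward + bootstrap - values[state]

        # Mark the current state as eligible for credit.
        traces[state] += 1.0

        # Every recently visited state receives a share of the same TD error,
        # then its trace decays by gamma * lambda.
        for s in range(len(values)):
            values[s] += alpha * td_error * traces[s]
            traces[s] *= gamma * lam
    return values
```

With `lam=0` the update reduces to one-step TD and only the most recent state is adjusted; with `lam` close to 1, a reward at the end of the episode also adjusts the estimates of states visited earlier in that episode, at the cost of higher variance.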
As in classical AI planning, the Atari 2600 games supported in the Arcade Learning Environment all feature a fully observable (RAM) state and actions that have deterministic effects. At the same time, the problems in ALE are given only implicitly, via a simulator, which a priori precludes the use of most modern classical planning techniques. Despite that, Lipovetzky et al. [2015] re...
We introduce a novel type of actor-critic approach for deep reinforcement learning which is based on learning vector quantization. We replace the softmax operator of the policy with a more general and more flexible operator that is similar to the robust soft learning vector quantization algorithm. We compare our approach to the default A3C architecture on three Atari 2600 games and a simplistic...