Search results for: atari
Number of results: 829
Acorn – the Beginning The year was 1979. Atari introduced a coin-operated version of Asteroids. The programming language Ada was born. 3Com, Oracle, and Seagate were founded. TI entered the computer market. Hayes marketed its first modem, which became the industry standard. The Motorola 68K and Intel 8088 were released. And Hermann Hauser and Chris Curry, with the support of a group ...
A deep learning approach to reinforcement learning led to a general learner able to train on visual input to play a variety of arcade games at human and superhuman levels. Its creators at Google DeepMind called the approach Deep Q-Network (DQN). We present an extension of DQN with "soft" and "hard" attention mechanisms. Tests of the proposed Deep Attention Recurrent Q-Network (DARQN) ...
We introduce a deep, generative autoencoder capable of learning hierarchies of distributed representations from data. Successive deep stochastic hidden layers are equipped with autoregressive connections, which enable the model to be sampled from quickly and exactly via ancestral sampling. We derive an efficient approximate parameter estimation method based on the minimum description length (MDL) ...
We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning. MAC is a policy gradient algorithm that uses the agent’s explicit representation of all action values to estimate the gradient of the policy, rather than using only the actions that were actually executed. This significantly reduces variance in the gradient updates and removes the n...
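The all-action gradient estimator this snippet describes can be sketched for a softmax policy at a single state (a minimal illustration under that assumption, not the authors' code; `theta` and `q_values` are hypothetical names):

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum()

def mac_gradient(theta, q_values):
    """Policy-gradient estimate that averages over ALL actions,
    weighted by the policy, instead of using only the sampled action:
        grad = sum_a pi(a) * grad log pi(a) * Q(s, a)
    For softmax logits, grad log pi(a) = e_a - pi, so the sum
    collapses to the closed form below."""
    pi = softmax(theta)
    baseline = pi @ q_values            # E_pi[Q(s, .)]
    return pi * (q_values - baseline)   # no sampling over actions needed
```

Because every action contributes through its probability, no action needs to be sampled at a visited state, which is the source of the variance reduction the abstract refers to.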
Recently, a methodology has been proposed for boosting the computational intelligence of randomized game-playing programs. We modify this methodology by working on rectangular, rather than square, matrices, and we apply it to the game of Domineering. At CIG 2015, we propose a demo in the case of Go. Hence, players on site can contribute to the scientific validation by playing (in a double-blind man...
Efficient exploration in complex environments remains a major challenge for reinforcement learning. We propose bootstrapped DQN, a simple algorithm that explores in a computationally and statistically efficient manner through the use of randomized value functions. Unlike dithering strategies such as ε-greedy exploration, bootstrapped DQN carries out temporally-extended (or deep) exploration; this ca...
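The exploration mechanism this snippet describes can be illustrated in tabular form (a sketch under the assumption of a small discrete MDP; the paper trains K bootstrapped heads on a shared deep network, not independent Q-tables, and all names here are illustrative):

```python
import numpy as np

class BootstrappedQ:
    """Tabular stand-in for bootstrapped DQN's K heads: one head is
    sampled per episode and followed greedily, giving temporally-extended
    exploration instead of per-step dithering."""

    def __init__(self, n_states, n_actions, k=10, lr=0.1, gamma=0.99, seed=0):
        rng = np.random.default_rng(seed)
        # random initialization diversifies the heads
        self.q = rng.normal(size=(k, n_states, n_actions))
        self.k, self.lr, self.gamma = k, lr, gamma
        self.head = 0

    def start_episode(self, rng):
        # commit to one randomly drawn head for the whole episode
        self.head = int(rng.integers(self.k))

    def act(self, s):
        # act greedily with respect to the episode's head
        return int(np.argmax(self.q[self.head, s]))

    def update(self, s, a, r, s2, mask):
        # each head trains only on its own bootstrapped subsample (mask)
        for h in range(self.k):
            if mask[h]:
                target = r + self.gamma * self.q[h, s2].max()
                self.q[h, s, a] += self.lr * (target - self.q[h, s, a])
```

The per-episode commitment to a single head is what makes the exploration "deep": a head that is optimistic about a distant state keeps steering the agent toward it for the whole episode.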
Deep Reinforcement Learning (DRL) has had several breakthroughs, from helicopter control and Atari games to the AlphaGo success. Despite these successes, DRL still lacks several important features of human intelligence, such as transfer learning, planning, and interpretability. We compare two DRL approaches at learning and generalization: Deep Q-Networks and Deep Symbolic Reinforcement Learni...
Most learning algorithms are not invariant to the scale of the signal that is being approximated. We propose to adaptively normalize the targets used in the learning updates. This is important in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when we update the policy of behavior. Our main motivation is prior work on learning to ...
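The adaptive normalization this snippet proposes can be sketched with running first and second moments of the targets (illustrative only; the paper's full method additionally rewrites the output layer so the network's unnormalized predictions are preserved when the statistics change):

```python
import numpy as np

class TargetNormalizer:
    """Adaptively rescale regression targets toward zero mean / unit
    variance using exponentially weighted running statistics, so the
    learner always sees targets of roughly constant magnitude."""

    def __init__(self, beta=0.01):
        self.mean, self.sq_mean, self.beta = 0.0, 1.0, beta

    def update(self, target):
        # track running first and second moments of the raw targets
        self.mean += self.beta * (target - self.mean)
        self.sq_mean += self.beta * (target ** 2 - self.sq_mean)

    @property
    def std(self):
        # clamp the variance estimate to keep division well-defined
        var = max(self.sq_mean - self.mean ** 2, 1e-12)
        return np.sqrt(var)

    def normalize(self, target):
        return (target - self.mean) / self.std

    def denormalize(self, y):
        return y * self.std + self.mean
```

Normalized targets keep update magnitudes stable even when the behavior policy changes and the scale of the true values drifts over time, which is the setting the abstract describes.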
We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement. By making several approximations to the theoretically-justified procedure, we develop a practical algorithm, called Trust Region Policy Optimization (TRPO). This algorithm is similar to natural policy gradient methods and is effective for optimizing large nonlinear policies such as neural networks...
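For a single state with a discrete action set, the surrogate objective and the trust-region constraint behind TRPO reduce to the two sums below (a toy sketch; names are illustrative, and the real algorithm maximizes the surrogate subject to a mean-KL bound across states using a conjugate-gradient step):

```python
import numpy as np

def surrogate_and_kl(pi_old, pi_new, advantages):
    """TRPO quantities for one state with discrete action probabilities.

    pi_old, pi_new : action distributions before/after the update
    advantages     : advantage estimates A(s, a) for each action
    """
    ratio = pi_new / pi_old                           # importance weights
    surrogate = np.sum(pi_old * ratio * advantages)   # E_old[ratio * A]
    kl = np.sum(pi_old * np.log(pi_old / pi_new))     # KL(pi_old || pi_new)
    return surrogate, kl
```

Bounding the KL term is what yields the monotonic-improvement guarantee the abstract mentions: the surrogate only approximates the true objective near `pi_old`, so steps are restricted to a region where that approximation holds.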
We have developed a chemo-dynamical approach to assign 36,010 metal-poor SkyMapper stars to various Galactic stellar populations. Using two independent techniques (velocity and action-space behavior), $Gaia$ EDR3 astrometry, and photometric metallicities, we selected stars with the characteristics of the "metal-weak" thick disk population by minimizing contamination from the canonical thick disk or other structures. This sample co...