atari

An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients

Journal: CoRR 2018

Jiaming Song Yuhuai Wu

Deep reinforcement learning methods have shown tremendous success in a large variety of tasks, such as Go [Silver et al., 2016], Atari [Mnih et al., 2013], and continuous control [Lillicrap et al., 2015, Schulman et al., 2015]. Policy gradient methods [Williams, 1992] are an important family of methods in model-free reinforcement learning, and the current state-of-the-art policy gradient methods ar...

Full text

Initial Progress in Transfer for Deep Reinforcement Learning Algorithms

2016

Yunshu Du Gabriel V. de la Cruz James Irwin Matthew E. Taylor

As one of the first successful models to combine reinforcement learning techniques with deep neural networks, the Deep Q-network (DQN) algorithm has gained attention because it bridges the gap between high-dimensional sensor inputs and autonomous agent learning. However, one main drawback of DQN is the long training time required for a single task. This work aims to leverage transfer learning...

Full text

Understanding Visual Concepts with Continuation Learning

Journal: CoRR 2016

William F. Whitney Michael Chang Tejas D. Kulkarni Joshua B. Tenenbaum

We introduce a neural network architecture and a learning algorithm to produce factorized symbolic representations. We propose to learn these concepts by observing consecutive frames, letting all the components of the hidden representation except a small discrete set (gating units) be predicted from the previous frame, and letting the factors of variation in the next frame be represented entirely b...

Full text

Progressive Neural Networks

Journal: CoRR 2016

Andrei A. Rusu Neil C. Rabinowitz Guillaume Desjardins Hubert Soyer James Kirkpatrick Koray Kavukcuoglu Razvan Pascanu Raia Hadsell

Learning to solve complex sequences of tasks—while both leveraging transfer and avoiding catastrophic forgetting—remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this archite...

Full text

Loop Quantum Gravity: Lee Smolin [2

2011

Lee Smolin

Lee Smolin, a theoretical physicist, is concerned with quantum gravity, "the name we give to the theory that unifies all the physics now under construction." More specifically, he is a co-inventor of an approach called loop quantum gravity. In 2001, he became a founding member and research physicist of the Perimeter Institute for Theoretical Physics, in Waterloo, Ontario. Smolin is the author of...

Full text

UCB-NE-5008 Reduction of TRU Toxicity in LWR-Spent Fuel by Reference ATW System with LBE-Cooled Transmuters

2006

M. Cheon J. Ahn E. Greenspan P. L. Chambré

Full text

Supplementary Material: Action-Conditional Video Prediction using Deep Networks in Atari Games

2015

Junhyuk Oh Xiaoxiao Guo Honglak Lee Richard Lewis Satinder Singh

The network architectures of the proposed models and the baselines are illustrated in Figure 1. The LSTM weights are initialized from a uniform distribution over [−0.08, 0.08]. The weights of the fully-connected layers from the encoded feature to the factored layer and from the action to the factored layer are initialized from uniform distributions over [−1, 1] and [−0.1, 0.1], respectively.
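The initialization ranges described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the layer sizes, the weight-matrix layout, and the names `uniform_matrix`, `FEATURE_DIM`, `ACTION_DIM`, `FACTOR_DIM`, and `HIDDEN_DIM` are all assumptions made for the example; only the three uniform ranges come from the text.

```python
import random


def uniform_matrix(rows, cols, bound, rng):
    """Return a rows x cols nested list with entries drawn from U[-bound, bound]."""
    return [[rng.uniform(-bound, bound) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical layer sizes, chosen small for illustration only.
FEATURE_DIM, ACTION_DIM, FACTOR_DIM, HIDDEN_DIM = 8, 4, 6, 8

# LSTM weights: uniform over [-0.08, 0.08] (four gates stacked, an assumed layout).
lstm_weights = uniform_matrix(4 * HIDDEN_DIM, FEATURE_DIM + HIDDEN_DIM, 0.08, rng)

# Encoded feature -> factored layer: uniform over [-1, 1].
feature_to_factor = uniform_matrix(FACTOR_DIM, FEATURE_DIM, 1.0, rng)

# Action -> factored layer: uniform over [-0.1, 0.1].
action_to_factor = uniform_matrix(FACTOR_DIM, ACTION_DIM, 0.1, rng)
```

In a deep learning framework the same ranges would normally be passed to that framework's own uniform initializer; plain lists are used here only to keep the sketch free of dependencies.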

Full text

High Energy Cosmic Rays from Neutrinos (arXiv:astro-ph/9902266v1, 18 Feb 1999)

1999

J. J. Blanco-Pillado R. A. Vázquez E. Zas

We discuss recent models in which neutrinos, assumed to have mass in the eV range, produce the highest energy cosmic rays by interacting with the enhanced density of the relic cosmic neutrino background in the galactic halo. We make an analytical calculation of the required neutrino fluxes

Full text

Participatory development of a strategic product portfolio in a telecommunication company

Journal: IJTM 2008

Mats R. K. Lindstedt Juuso Liesiö Ahti Salo

The development of a product portfolio is a strategic decision which is often complicated by the large number of competing products, product interactions and high uncertainties about how successful the products will be in the marketplace. These decisions are commonly supported either by financially oriented approaches (e.g., net present value) or more qualitative approaches (e.g., scoring model...

Full text

Derivative Securities, Fall 2010

2010

Jonathan Goodman

The dynamic replication strategy of Black and Scholes is important enough that it is worth repeating from last week. Recall the setup. From day $k-1$ to day $k$, the stock (risky asset) price either goes up, $S_{k-1} \to S_k = uS_{k-1}$, or goes down, $S_k = dS_{k-1}$ (recall that we did not actually need $u > 1$ or $d < 1$, but it is convenient to think of $u$ as up and $d$ as down). The replicating portfolio is ...
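As a rough illustration of the up/down setup above, the sketch below works through the standard one-period binomial replication argument. It is not taken from the notes, whose own construction is cut off in this excerpt: the function name `replicate_one_period`, the per-period interest rate `r`, and the example numbers are assumptions introduced here.

```python
def replicate_one_period(s_prev, u, d, v_up, v_down, r=0.0):
    """Solve for a stock/cash portfolio that matches a payoff in both states.

    The portfolio holds `delta` shares and `cash` in a riskless account
    (interest rate r per period, an assumed parameter), chosen so that
        delta * u * s_prev + cash * (1 + r) == v_up
        delta * d * s_prev + cash * (1 + r) == v_down
    Returns (delta, cash, cost), where cost is the portfolio value today.
    """
    delta = (v_up - v_down) / ((u - d) * s_prev)
    cash = (v_up - delta * u * s_prev) / (1.0 + r)
    cost = delta * s_prev + cash
    return delta, cash, cost


# Hypothetical example: stock at 100, u = 1.1, d = 0.9, call option struck at 100.
s_prev, u, d = 100.0, 1.1, 0.9
v_up = max(u * s_prev - 100.0, 0.0)    # option payoff if the stock goes up
v_down = max(d * s_prev - 100.0, 0.0)  # option payoff if the stock goes down
delta, cash, cost = replicate_one_period(s_prev, u, d, v_up, v_down)
```

Because the portfolio reproduces the payoff in both the up and the down state, its cost today is the no-arbitrage price of the payoff under these assumptions.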

Full text