Search results for: atari
Number of results: 829
AI systems are increasingly applied to complex tasks that involve interaction with humans. During training, such systems are potentially dangerous, as they haven’t yet learned to avoid actions that could cause serious harm. How can an AI system explore and learn without making a single mistake that harms humans or otherwise causes serious damage? For model-free reinforcement learning, having a ...
Evolution Strategies (ES) have recently been demonstrated to be a viable alternative to reinforcement learning (RL) algorithms on a set of challenging deep RL problems, including Atari games and MuJoCo humanoid locomotion benchmarks. While the ES algorithms in that work belonged to the specialized class of natural evolution strategies (which resemble approximate gradient RL algorithms, such as ...
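The ES variant the snippet refers to estimates a policy gradient from Gaussian perturbations of the parameters rather than from backpropagation. A minimal sketch of that idea (the objective, step sizes, and population size here are illustrative, not the paper's settings):

```python
import numpy as np

def evolution_strategies(f, theta, sigma=0.1, alpha=0.01, n=50, iters=200):
    """Minimal ES loop: estimate the gradient of E[f(theta + sigma*eps)]
    from n Gaussian perturbations and take an ascent step."""
    for _ in range(iters):
        eps = np.random.randn(n, theta.size)
        rewards = np.array([f(theta + sigma * e) for e in eps])
        # Standardize rewards so the update is invariant to reward scale.
        adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        theta = theta + alpha / (n * sigma) * eps.T @ adv
    return theta

# Toy usage: maximize -||x - 3||^2, whose optimum is x = 3 in every coordinate.
np.random.seed(0)
theta = evolution_strategies(lambda x: -np.sum((x - 3.0) ** 2), np.zeros(5))
```

In the Atari and MuJoCo experiments the perturbed evaluations are episode returns gathered by parallel workers; the update rule itself is the same as in this toy version.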
Organizing code into coherent programs and relating different programs to each other represents an underlying requirement for scaling genetic programming to more difficult task domains. Assuming a model in which policies are defined by teams of programs, in which team and program are represented using independent populations and coevolved, has previously been shown to support the development of...
Different reinforcement learning (RL) methods exist to address the problem of combining multiple different learners to generate a superior learner. These existing methods usually assume that each learner uses the same algorithm and/or state representation. We propose an ensemble method that combines a set of base learners and leverages their strengths online. We demonstrate the proposed learner's ability to combine base learners and adapt to changes in their performance on var...
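The snippet is truncated before it describes the combination mechanism, so the following is only one plausible sketch of an online ensemble over heterogeneous base learners: track each learner's recent return with an exponential moving average and sample learners via a softmax over those averages. The `act(state)` interface and all constants are assumptions, not taken from the paper.

```python
import math
import random

class OnlineEnsemble:
    """Illustrative online ensemble over base RL learners. Each base learner
    is assumed to expose act(state); the ensemble samples a learner per
    episode with probability proportional to a softmax over EMA returns."""
    def __init__(self, learners, temperature=1.0, decay=0.9):
        self.learners = learners
        self.scores = [0.0] * len(learners)   # EMA of episode returns
        self.temperature = temperature
        self.decay = decay

    def pick(self):
        # random.choices normalizes the weights internally.
        weights = [math.exp(s / self.temperature) for s in self.scores]
        return random.choices(range(len(self.learners)), weights=weights)[0]

    def update(self, idx, episode_return):
        # The EMA forgets old performance, letting the ensemble adapt when a
        # base learner's effectiveness changes over time.
        self.scores[idx] = self.decay * self.scores[idx] \
            + (1 - self.decay) * episode_return
```

The softmax/EMA choices are one standard way to realize "leveraging strengths online"; the actual paper may combine value estimates or action preferences instead of whole policies.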
In recent years, neural networks have enjoyed a renaissance as function approximators in reinforcement learning. Two decades after Tesauro's TD-Gammon achieved near top-level human performance in backgammon, the deep reinforcement learning algorithm DQN achieved human-level performance in many Atari 2600 games. The purpose of this study is twofold. First, we propose two activation functions for...
The Atari 2600 games supported in the Arcade Learning Environment (Bellemare et al. 2013) all feature a known initial (RAM) state and actions that have deterministic effects. Classical planners, however, cannot be used for selecting actions for two reasons: first, no compact PDDL-model of the games is given, and more importantly, the action effects and goals are not known a priori. Moreover, in...
We consider an agent’s uncertainty about its environment and the problem of generalizing this uncertainty across states. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary ...
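The pseudo-count construction in that line of work derives a visit-count surrogate from any density model: if ρ(x) is the model's probability of state x before observing it and ρ′(x) its probability immediately after, the pseudo-count is N̂(x) = ρ(x)(1 − ρ′(x)) / (ρ′(x) − ρ(x)), and an exploration bonus proportional to 1/√N̂ is added to the reward. A toy sketch with a Laplace-smoothed frequency model standing in for the learned density model (the paper uses learned models such as CTS; β and the smoothing are illustrative):

```python
import math
from collections import Counter

class PseudoCountBonus:
    """Toy pseudo-count exploration bonus. The 'density model' here is a
    Laplace-smoothed empirical frequency over discrete states; it is a
    stand-in for the learned density models used in the actual work."""
    def __init__(self, beta=0.05):
        self.counts = Counter()
        self.total = 0
        self.beta = beta

    def _density(self, x):
        return (self.counts[x] + 1) / (self.total + 2)

    def bonus(self, x):
        rho = self._density(x)            # probability before observing x
        self.counts[x] += 1
        self.total += 1
        rho_prime = self._density(x)      # probability after observing x
        # Pseudo-count: N(x) = rho * (1 - rho') / (rho' - rho)
        n_hat = rho * (1 - rho_prime) / max(rho_prime - rho, 1e-12)
        # Count-based bonus decays as the state is revisited.
        return self.beta / math.sqrt(n_hat + 0.01)
```

With this particular stand-in model the pseudo-count recovers the true visit count exactly, which is the sanity check the pseudo-count definition is designed to pass; the interesting cases are non-tabular models, where N̂ generalizes counts across similar states.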
The inclusive cross section for the production of charmed D mesons in two-photon processes is measured with the AMY detector at the TRISTAN e+e− collider. D mesons are identified from the distribution of charged-particle transverse momenta relative to the jet axis. A data sample corresponding to an integrated luminosity of 176 pb−1 at a center-of-mass energy of 58 GeV is used to determine a cros...
1 Raymond and Beverly Sackler Faculty of Exact Sciences, School of Physics and Astronomy, Tel-Aviv University, Tel-Aviv, Israel, Department of Astronomy and Astrophysics, Pennsylvania State University, University Park, PA, United States, 3 Institute for Gravitation and the Cosmos, Pennsylvania State University, University Park, PA, United States, Department of Physics, Pennsylvania State Univer...
∗Intelligent Systems Research Institute, National Institute of Advanced Industrial Science and Technology (AIST) Tsukuba Central 2, 1-1-1 Umezono, Tsukuba-shi, Ibaraki 305-8568, Japan E-mail: [email protected] ∗∗Technologic Arts Inc., Cosmos-Hongo 9F, 4-1-4 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan ∗∗∗System Engineering Consultants Co., Ltd., Setagaya Business Square, 4-10-1 Yoga, Setagaya-ku, To...