Search results for: atari

Number of results: 829

2005
Hervé Crès Mich Tvede

A general equilibrium model with uncertainty and production externalities is studied. In the absence of markets for externalities, we look for governances and conditions under which majority voting among shareholders is likely to give rise to efficient internalization. Two observations lead the analysis: On the one hand, the shareholders with the right incentives for efficient internalization are t...

2008
Tamar Frankel

The interests of advisers and their clients may conflict in unexpected ways. One such situation arises when the adviser’s partners or managers (portfolio managers) sign a non-compete agreement with the adviser and later, when they leave, are sought after by clients who wish to continue the relationship. The case is clear if the departing portfolio manager solicits the clients of the adviser in ...

Journal: J. Artif. Intell. Res. 2014
Alfonso Gerevini Alessandro Saetti Mauro Vallati

In the field of domain-independent planning, several powerful planners implementing different techniques have been developed. However, none of these systems outperforms all others in every known benchmark domain. In this work, we propose a multi-planner approach that automatically configures a portfolio of planning techniques for each given domain. The configuration process for a given domain...

2006
Curt Burmeister Helmut Mausser Rafael Mendoza

Managing tracking error on an ex ante basis requires an ability to assess the possible effects of trades on a fund’s performance relative to its benchmark. Given a trading strategy, its potential for reducing tracking error must be balanced against trading costs and return expectations. This chapter presents several simple diagnostic tools to help fund managers evaluate alternative trading stra...

2013
Alejandro Corichi Asieh Karami

Alejandro Corichi and Asieh Karami. Centro de Ciencias Matemáticas, Universidad Nacional Autónoma de México, UNAM-Campus Morelia, A. Postal 61-3, Morelia, Michoacán 58090, Mexico; Center for Fundamental Theory, Institute for Gravitation and the Cosmos, Pennsylvania State University, University Park PA 16802, USA; Instituto de Física y Matemáticas, Universidad Michoacana de San Nicolás de...

2002
Harry A. Schmitz

A fractal model of the cosmos is presented in terms of distinct orders of universes, particles, substrates and strata. Each universe in the fractal cosmos is characterized by the radius of that universe divided by the effective radius of one of its stratum particles. It is shown that this size ratio increases rapidly for higher order universes and that a series of universes of descending order ...

2017
Ruohan Zhang Zhuode Liu Mary M. Hayhoe Dana H. Ballard

When a learning agent attempts to imitate human visuomotor behaviors, it may benefit from knowing the human demonstrator’s visual attention. Such information could clarify the goal of the demonstrator, i.e., the object being attended is the most likely target of the current action. Hence it could help the agent better infer and learn the demonstrator’s underlying state representation for decisi...

Journal: CoRR 2016
Hado van Hasselt Arthur Guez Matteo Hessel David Silver

Most learning algorithms are not invariant to the scale of the function that is being approximated. We propose to adaptively normalize the targets used in learning. This is useful in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when we update the policy of behavior. Our main motivation is prior work on learning to play Atari ga...

Journal: CoRR 2017
Felix Leibfried Jordi Grau-Moya Haitham Bou-Ammar

We methodologically address the problem of Q-value overestimation in deep reinforcement learning to handle high-dimensional state spaces efficiently. By adapting concepts from information theory, we introduce an intrinsic penalty signal encouraging reduced Q-value estimates. The resultant algorithm encompasses a wide range of learning outcomes containing deep Q-networks as a special case. Differ...

2017
Oron Anschel Nir Baram Nahum Shimkin

The commonly used Q-learning algorithm combined with function approximation induces systematic overestimations of state-action values. These systematic errors might cause instability, poor performance and sometimes divergence of learning. In this work, we present the AVERAGED TARGET DQN (ADQN) algorithm, an adaptation to the DQN class of algorithms which uses a weighted average over past learne...
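
The overestimation mentioned in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions and not the algorithm from the paper: every true action value is set to 0, each estimate is corrupted by independent Gaussian noise, and several independent estimates are combined with a plain unweighted average. The function name `max_bias` and all parameter values are hypothetical choices made for this illustration.

```python
import random
import statistics


def max_bias(num_actions, noise_std, num_estimates, trials, seed=0):
    """Average of max_a Q_hat(a) when every true Q(a) is 0.

    Each Q_hat(a) is the mean of `num_estimates` independent noisy
    estimates, so any positive result is pure overestimation bias.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(trials):
        averaged = [
            statistics.fmean(
                rng.gauss(0.0, noise_std) for _ in range(num_estimates)
            )
            for _ in range(num_actions)
        ]
        # The max operator picks out whichever action got the luckiest noise.
        maxima.append(max(averaged))
    return statistics.fmean(maxima)


# Assumed settings: 10 actions, unit-variance noise, 2000 simulated states.
single = max_bias(num_actions=10, noise_std=1.0, num_estimates=1, trials=2000)
averaged = max_bias(num_actions=10, noise_std=1.0, num_estimates=5, trials=2000)

print(f"bias with 1 estimate:  {single:.3f}")
print(f"bias with 5 estimates: {averaged:.3f}")
```

Both values come out positive even though the true maximum is 0, and averaging several estimates shrinks the bias because it reduces the noise variance that the max operator exploits.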
