Search results for: return policy

Number of results: 335824

2013
JW Keep Q Adelasoye JR Pallett S Calvert

Objectives: The National Institute for Health and Clinical Excellence (NICE) recommends that after out-of-hospital cardiac arrest (OHCA) in patients with return of spontaneous circulation (ROSC), therapeutic hypothermia is induced “as soon as possible” to maintain core body temperature at 32-34°C for 12-24 hours.[1] Surface or internal cooling techniques are technically challenging in the Emergen...

Journal: CoRR 2017
Chris J. Maddison Dieterich Lawson George Tucker Nicolas Heess Arnaud Doucet Andriy Mnih Yee Whye Teh

The policy gradients of the expected return objective can react slowly to rare rewards. Yet, in some cases agents may wish to emphasize the low or high returns regardless of their probability. Borrowing from the economics and control literature, we review the risk-sensitive value function that arises from an exponential utility and illustrate its effects on an example. This risk-sensitive value...

Journal: J. Informetrics 2008
Jeppe Nicolaisen Tove Faber Frandsen

The paper introduces a new journal impact measure called The Reference Return Ratio (3R). Unlike the traditional Journal Impact Factor (JIF), which is based on calculations of publications and citations, the new measure is based on calculations of bibliographic investments (references) and returns (citations). A comparative study of the two measures shows a strong relationship between the 3R an...

2000
Martin B Haugh Andrew W Lo

The fact that derivative securities are equivalent to specific dynamic trading strategies in complete markets suggests the possibility of constructing buy-and-hold portfolios of options that mimic certain dynamic investment policies, e.g. asset-allocation rules. We explore this possibility by solving the following problem: given an optimal dynamic investment policy, find a set of options at the...

2014
Mayank Daswani Peter Sunehag Marcus Hutter

The problem we consider in this paper is reinforcement learning with value advice. In this setting, the agent is given limited access to an oracle that can tell it the expected return (value) of any state-action pair with respect to the optimal policy. The agent must use this value to learn an explicit policy that performs well in the environment. We provide an algorithm called RLAdvice, based ...

2009
Yiting Li Guillaume Rocheteau

We study economies where some assets play an essential role to finance consumption opportunities but payment arrangements are subject to a moral hazard problem. Agents can produce fraudulent assets at a positive cost, which generates an endogenous upper bound on the quantity of assets that can be exchanged for goods and services. This endogenous liquidity constraint depends on the characteristic...

2010
Zongwu Cai Linna Chen Ying Fang

This paper models the return series of USD/CNY exchange rate by considering the conditional mean and conditional volatility simultaneously. An index type functional-coefficient model is adopted to model the conditional mean part and a GARCH type model with a policy dummy variable is applied to the conditional volatility model. We show that the government policy indeed has an impact on the excha...

1996
Edwin Mansfield

My assigned topic is the question: Can policymakers spur or deter technological change? The question is to be addressed from a micro perspective, by examining policies regarding research and development (R&D), patents, and competition. Since there is no point in keeping the reader in suspense, I shall argue that government policy plays a major role in influencing the rate of technological chang...

2016
Haris Aziz Thomas Kalinowski Toby Walsh Lirong Xia

Sequential allocation is a simple and attractive mechanism for the allocation of indivisible goods used in a number of real world settings. In sequential allocation, agents pick items according to a policy, the order in which agents take turns. Sequential allocation will return an allocation which is Pareto efficient – no agent can do better without others doing worse. However, sequential alloc...
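The mechanism described in this abstract can be illustrated with a minimal Python sketch. It assumes each agent picks sincerely, taking its most preferred remaining item on its turn; the function name `sequential_allocation`, the agent labels, and the preference lists are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence


def sequential_allocation(
    policy: Sequence[str],
    preferences: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Allocate indivisible items by letting agents take turns.

    policy      -- the order in which agents pick, e.g. ["a", "b", "b", "a"]
    preferences -- each agent's items ranked from most to least preferred
    Assumption: agents pick sincerely (no strategic behaviour).
    """
    # Every item mentioned by any agent starts out unallocated.
    remaining = set()
    for ranking in preferences.values():
        remaining.update(ranking)

    bundles: Dict[str, List[str]] = {agent: [] for agent in preferences}
    for agent in policy:
        # The agent takes its highest-ranked item that is still available.
        for item in preferences[agent]:
            if item in remaining:
                remaining.remove(item)
                bundles[agent].append(item)
                break
    return bundles


if __name__ == "__main__":
    prefs = {
        "a": ["x", "y", "z", "w"],
        "b": ["x", "z", "y", "w"],
    }
    print(sequential_allocation(["a", "b", "b", "a"], prefs))
```

With the policy `a, b, b, a`, agent `a` takes `x` first, agent `b` then takes `z` and `y`, and `a` is left with `w`. Changing the policy changes who receives which items.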

2000
Malcolm J. A. Strens

The reinforcement learning problem can be decomposed into two parallel types of inference: (i) estimating the parameters of a model for the underlying process; (ii) determining behavior which maximizes return under the estimated model. Following Dearden, Friedman and Andre (1999), it is proposed that the learning process estimates online the full posterior distribution over models. To determine...
