Search results for: reward situation

Number of results: 163929

2013
Cristina Battaglino, Rossana Damiano, Leonardo Lesmo

This paper presents a model of agent behavior that takes into account emotions and moral values. In our proposal, when the description of the current situation reveals that an agent’s moral value is ‘at stake’, the moral goal of reestablishing the threatened value is included among the active goals. The compliance with values generates positive emotions like pride and admiration, while the oppo...

Journal: :Journal of comparative psychology 1991
M. R. Papini, M. E. Bitterman

The performance of Octopus cyanea was studied in 3 appetitive conditioning situations. In Experiment 1, 2 groups were trained in a runway; a large reward produced faster acquisition when reinforcement was consistent and better subsequent performance on a partial schedule than did a small reward. In Experiment 2, activity in the vicinity of a feeder was measured, and in Experiment 3, latency and...

2008
Mathias Pessiglione, Predrag Petrovic, Jean Daunizeau, Stefano Palminteri, Raymond J. Dolan, Chris D. Frith

How the brain uses success and failure to optimize future decisions is a long-standing question in neuroscience. One computational solution involves updating the values of context-action associations in proportion to a reward prediction error. Previous evidence suggests that such computations are expressed in the striatum and, as they are cognitively impenetrable, represent an unconscious learn...
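The abstract's "updating the values of context-action associations in proportion to a reward prediction error" can be sketched as a generic delta-rule update (an illustrative Rescorla-Wagner / Q-learning style rule, not the authors' exact computational model):

```python
# Minimal sketch of a reward-prediction-error (RPE) value update.
# The learning rate alpha and the reward sequence are illustrative.

def rpe_update(value, reward, alpha=0.1):
    """Move a context-action value toward the observed reward,
    in proportion to the prediction error."""
    delta = reward - value          # reward prediction error
    return value + alpha * delta    # proportional update

v = 0.0
for r in [1, 1, 0, 1]:              # hypothetical success/failure outcomes
    v = rpe_update(v, r)
```

With consistent rewards the value converges toward 1; intermittent failures pull it back down, which is how success and failure jointly shape future choice values.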

1997
Tucker Balch

This paper describes research investigating behavioral specialization in learning robot teams. Each agent is provided a common set of skills (motor schema-based behavioral assemblages) from which it builds a task-achieving strategy using reinforcement learning. The agents learn individually to activate particular behavioral assemblages given their current situation and a reward signal. The exper...
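Learning which behavioral assemblage to activate in a given situation from a scalar reward can be sketched with simple tabular Q-learning (a hypothetical stand-in; the paper's actual learning rule and behavior names may differ):

```python
# Hypothetical sketch: select a behavioral assemblage per situation,
# learning from a scalar reward (tabular Q-learning, epsilon-greedy).
import random
from collections import defaultdict

ASSEMBLAGES = ["forage", "deliver", "avoid"]    # illustrative behavior set

Q = defaultdict(float)                          # (situation, behavior) -> value

def choose(situation, epsilon=0.1):
    if random.random() < epsilon:               # occasional exploration
        return random.choice(ASSEMBLAGES)
    return max(ASSEMBLAGES, key=lambda b: Q[(situation, b)])

def learn(situation, behavior, reward, alpha=0.5):
    """Nudge the activated behavior's value toward the received reward."""
    Q[(situation, behavior)] += alpha * (reward - Q[(situation, behavior)])
```

Because each agent keeps its own Q table, identical robots can specialize into different roles purely through their individual reward histories.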

2003
Alfredo Gabaldon

A promising technique used in some planning systems to improve their performance is the use of domain-dependent search control knowledge. We present a procedure for compiling search control knowledge, expressed declaratively in a logic, into the preconditions of the plan actions (operators). We do this within the framework of the situation calculus by introducing a transformation of nonMarkovia...

2008
Mikhail Soutchanski, Paulo Santos

Reasoning about perception of depth and about spatial relations between moving physical objects is a challenging problem. We investigate the representation of depth and motion by means of depth profiles whereby each object in the world is represented as a single peak. We propose a logical theory, formulated in the situation calculus (SC), that is used for reasoning about object motion (includin...

1998
Fangzhen Lin

By using an example from a robot navigation domain, we argue that to specify declaratively the behavior of an agent, we need to have a formal and explicit notion of "quality plans." To that end, we propose the following three domain-independent measures of plan quality: a plan is said to be A-minimal if none of the actions in it can be deleted and have it continue to be a plan; it is said to be ...

1998
Josefina Sierra-Santibáñez

We present a declarative formalization of STRIPS [1] as a reasoning strategy in the situation calculus [10]. The idea is to use logic not only to represent planning problems, but also to describe the mental situations, mental actions and reasoning strategy STRIPS uses to solve those problems.
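The STRIPS operator semantics being formalized here can be illustrated with a minimal sketch of operator applicability and application (illustrative only; the paper itself formalizes STRIPS declaratively in the situation calculus, and the blocks-world operator below is a hypothetical example):

```python
# Minimal sketch of STRIPS operator application over sets of ground atoms.

def applicable(state, precond):
    """An operator applies when its preconditions hold in the state."""
    return precond <= state

def apply_op(state, precond, add, delete):
    """Return the successor state if the operator applies, else None."""
    if not applicable(state, precond):
        return None
    return (state - delete) | add   # remove delete-list, then add add-list

# Hypothetical blocks-world operator: pick up block a from the table.
s0 = {"ontable(a)", "clear(a)", "handempty"}
s1 = apply_op(s0,
              precond={"ontable(a)", "clear(a)", "handempty"},
              add={"holding(a)"},
              delete={"ontable(a)", "clear(a)", "handempty"})
```

In the paper's framing, both this state-transition step and the strategy that decides which operator to try next are themselves described in logic, as mental actions of the planner.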

1995
Kristof Van Belleghem, Marc Denecker, Danny De Schreye

In this paper we study the differences between two logic theories for temporal reasoning, the Situation Calculus and the Event Calculus, and the implications of these differences. We construct a new formalism that combines the advantages of both Situation and Event Calculus and avoids the problems of either. The new formalism is useful for general temporal reasoning in worlds with discrete and co...

Journal: :CoRR 2017
Yan Li, Zhaohan Sun

In most common settings of a Markov Decision Process (MDP), an agent evaluates a policy based on the expectation of the (discounted) sum of rewards. However, in many applications this criterion might not be suitable, from two perspectives: first, in risk-averse situations the expectation of accumulated rewards is not robust enough; this is the case when the distribution of accumulated reward is heavily skewed; anot...

Chart: number of search results per year
