Search results for: reward situation
Number of results: 163929
This paper presents a model of agent behavior that takes into account emotions and moral values. In our proposal, when the description of the current situation reveals that an agent’s moral value is ‘at stake’, the moral goal of reestablishing the threatened value is included among the active goals. The compliance with values generates positive emotions like pride and admiration, while the oppo...
The performance of Octopus cyanea was studied in 3 appetitive conditioning situations. In Experiment 1, 2 groups were trained in a runway; a large reward produced faster acquisition when reinforcement was consistent and better subsequent performance on a partial schedule than did a small reward. In Experiment 2, activity in the vicinity of a feeder was measured, and in Experiment 3, latency and...
How the brain uses success and failure to optimize future decisions is a long-standing question in neuroscience. One computational solution involves updating the values of context-action associations in proportion to a reward prediction error. Previous evidence suggests that such computations are expressed in the striatum and, as they are cognitively impenetrable, represent an unconscious learn...
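The value update this abstract describes — adjusting context-action values in proportion to a reward prediction error — can be sketched in a few lines. This is a minimal illustrative form, not the paper's actual model; the learning rate `alpha` and the tabular representation are assumptions.

```python
def update_value(value, reward, alpha=0.1):
    """Move a context-action value toward the observed reward,
    in proportion to the reward prediction error."""
    delta = reward - value        # reward prediction error
    return value + alpha * delta  # update scaled by learning rate

# Toy run: the value drifts toward the mean observed reward.
v = 0.0
for r in [1.0, 1.0, 0.0, 1.0]:
    v = update_value(v, r)
```

Repeated updates make the value an exponentially weighted average of past rewards, which is the standard computational reading of such prediction-error learning.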
This paper describes research investigating behavioral specialization in learning robot teams. Each agent is provided a common set of skills (motor schema-based behavioral assemblages) from which it builds a task-achieving strategy using reinforcement learning. The agents learn individually to activate particular behavioral assemblages given their current situation and a reward signal. The exper...
A promising technique used in some planning systems to improve their performance is the use of domain dependent search control knowledge. We present a procedure for compiling search control knowledge, expressed declaratively in a logic, into the preconditions of the plan actions (operators). We do this within the framework of the situation calculus by introducing a transformation of nonMarkovia...
Reasoning about perception of depth and about spatial relations between moving physical objects is a challenging problem. We investigate the representation of depth and motion by means of depth profiles whereby each object in the world is represented as a single peak. We propose a logical theory, formulated in the situation calculus (SC), that is used for reasoning about object motion (includin...
By using an example from a robot navigation domain, we argue that to specify declaratively the behavior of an agent, we need to have a formal and explicit notion of "quality plans." To that end, we propose the following three domain-independent measures of plan quality: a plan is said to be A-minimal if none of the actions in it can be deleted and have it continue to be a plan; it is said to be ...
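The A-minimality criterion above has a direct operational reading: delete each action in turn and check whether what remains is still a plan. A small sketch, where `is_plan` is a hypothetical stand-in for a domain-specific plan-validity check (not from the paper):

```python
def a_minimal(plan, is_plan):
    """A plan is A-minimal if it is a plan and deleting any
    single action leaves something that is no longer a plan."""
    if not is_plan(plan):
        return False
    return all(
        not is_plan(plan[:i] + plan[i + 1:])  # drop action i
        for i in range(len(plan))
    )

# Toy domain: any action sequence containing "goal" counts as a plan.
valid = lambda p: "goal" in p
# ["goal"] is A-minimal; ["noop", "goal"] is not, since "noop" is deletable.
```

The check is quadratic in plan length, which is cheap relative to plan validation in most domains.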
We present a declarative formalization of STRIPS [1] as a reasoning strategy in the situation calculus [10]. The idea is to use logic not only to represent planning problems, but also to describe the mental situations, mental actions and reasoning strategy STRIPS uses to solve those problems.
In this paper we study the differences between two logic theories for temporal reasoning, the Situation Calculus and the Event Calculus, and the implications of these differences. We construct a new formalism that combines the advantages of both Situation and Event Calculus and avoids the problems of either. The new formalism is useful for general temporal reasoning in worlds with discrete and co...
In most common settings of Markov Decision Processes (MDPs), an agent evaluates a policy based on the expectation of the (discounted) sum of rewards. However, in many applications this criterion might not be suitable for two reasons: first, in risk-averse situations the expectation of accumulated rewards is not robust enough; this is the case when the distribution of accumulated reward is heavily skewed; anot...
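The abstract's point about skewed return distributions can be made concrete: two policies can have the same expected return while differing sharply in risk, so expectation alone cannot distinguish them. A toy numerical sketch (illustrative numbers, not from the paper):

```python
def expected_return(returns, probs):
    """Expectation of the (already accumulated) return distribution."""
    return sum(p * r for p, r in zip(probs, returns))

# Two return distributions with identical expectation but different risk:
safe  = ([10.0, 10.0], [0.5, 0.5])   # deterministic return of 10
risky = ([0.0, 20.0],  [0.5, 0.5])   # spread-out, same mean

assert expected_return(*safe) == expected_return(*risky) == 10.0
# A simple risk-aware criterion (worst-case return) tells them apart:
assert min(safe[0]) > min(risky[0])
```

Risk-sensitive MDP formulations replace or augment the expectation with criteria of this kind (e.g. worst-case or quantile-based measures) precisely to capture this distinction.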