Search results for: reward processes

Number of results: 554393

2011
Kevin Regan, Craig Boutilier

Specifying the reward function of a Markov decision process (MDP) can be demanding, requiring human assessment of the precise quality of, and tradeoffs among, various states and actions. However, reward functions often possess considerable structure which can be leveraged to streamline their specification. We develop new, decision-theoretically sound heuristics for eliciting rewards for factored...
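A minimal sketch of the factored structure such elicitation can exploit (the features, weights, and `factored_reward` helper below are hypothetical illustrations, not the authors' method): when the reward decomposes additively over local state features, each term can be specified independently rather than assessing every full state.

```python
# Minimal sketch: additive factored reward over independent state features.
# All feature names and numeric weights here are hypothetical.

def factored_reward(state, action):
    """Reward as a sum of local terms, each touching one feature."""
    battery_term = 1.0 if state["battery"] == "high" else -0.5
    location_term = 2.0 if state["location"] == "goal" else 0.0
    action_cost = -0.1 if action == "move" else 0.0
    return battery_term + location_term + action_cost

s = {"battery": "high", "location": "goal"}
print(factored_reward(s, "move"))  # 1.0 + 2.0 - 0.1
```

Because each local term involves only one feature, a human assessor can judge it in isolation, which is the kind of structure that makes elicitation tractable.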

2017
Scott A. Schelp, Katherine J. Pultorak, Dylan R. Rakowski, Devan M. Gomez, Gregory Krzystyniak, Raibatak Das, Erik B. Oleson

The mesolimbic dopamine system is strongly implicated in motivational processes. Currently accepted theories suggest that transient mesolimbic dopamine release events energize reward seeking and encode reward value. During the pursuit of reward, critical associations are formed between the reward and cues that predict its availability. Conditioned by these experiences, dopamine neurons begin to...

Journal: Vision Research, 2015
Elisa Infanti, Clayton Hickey, Massimo Turatto

Reward plays a fundamental role in human behavior. A growing number of studies have shown that stimuli associated with reward become salient and attract attention. The aim of the present study was to extend these results into the investigation of iconic memory and visual working memory. In two experiments we asked participants to perform a visual-search task where different colors of the target...

Journal: Psychological Review, 2007
A. David Redish, Steve Jensen, Adam Johnson, Zeb Kurth-Nelson

Because learned associations are quickly renewed following extinction, the extinction process must include processes other than unlearning. However, reinforcement learning models, such as the temporal difference reinforcement learning (TDRL) model, treat extinction as an unlearning of associated value and are thus unable to capture renewal. TDRL models are based on the hypothesis that dopamine ...
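A minimal TD(0) sketch of the behavior the abstract critiques (learning rate and trial counts are arbitrary toy values): in a standard TDRL account, extinction drives the learned value back toward zero, i.e. pure unlearning, which by itself cannot explain rapid renewal of the association.

```python
# TD(0) value learning on a single cue-reward pairing (toy numbers).
alpha = 0.5          # learning rate
V = 0.0              # learned value of the cue

# Acquisition: the cue is followed by reward r = 1.
for _ in range(20):
    V += alpha * (1.0 - V)   # TD error: delta = r - V

acquired = V                 # approaches 1.0

# Extinction: the cue is followed by no reward, r = 0.
for _ in range(20):
    V += alpha * (0.0 - V)   # value is gradually unlearned

print(acquired, V)           # high after acquisition, near zero after extinction
```

Because the extinguished value is simply erased, a fresh acquisition phase would be needed to restore it, whereas real renewal is much faster; that mismatch is the point of departure for the abstract's argument.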

Journal: Biological Psychology, 2013
Daniel Vega, Àngel Soto, Julià L. Amengual, Joan Ribas, Rafael Torrubia, Antoni Rodríguez-Fornells, Josep Marco-Pallarés

Borderline Personality Disorder (BPD) patients present profound disturbances in affect regulation and impulse control which could reflect a dysfunction in reward-related processes. The current study investigated these processes in a sample of 18 BPD patients and 18 matched healthy controls, using an event-related brain potentials methodology. Results revealed a reduction in the amplitude of the...

1990
Gianfranco Ciardo, Kishor S. Trivedi

With the increasing complexity of multiprocessor and distributed processing systems, the need to develop efficient and accurate modeling methods is evident. Fault tolerance and degradable performance of such systems have given rise to considerable interest in models for the combined evaluation of performance and reliability [1], [2]. Markov or semi-Markov reward models can be used to evaluate th...
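A minimal sketch of a Markov reward model of the kind described (the two-state up/down chain and all its numbers are hypothetical): attach a reward rate to each state, solve for the stationary distribution, and take the expected reward rate as a combined performance/reliability measure.

```python
# Discrete-time Markov reward model for a toy repairable system:
# state 0 = 'up' (earns reward 1 per step), state 1 = 'down' (earns 0).
import numpy as np

P = np.array([[0.9, 0.1],    # up -> up, up -> down (failure)
              [0.6, 0.4]])   # down -> up (repair), down -> down
r = np.array([1.0, 0.0])     # reward rate attached to each state

# Stationary distribution pi solves pi P = pi with sum(pi) = 1;
# stack the normalization row and solve by least squares.
A = np.vstack([P.T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

expected_reward_rate = pi @ r   # long-run reward per step
print(pi, expected_reward_rate)
```

With these numbers the system is up 6/7 of the time, so the long-run reward rate equals 6/7; swapping in different reward vectors lets the same chain score either performance or availability.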

Journal: CoRR, 2015
Reinaldo Uribe, Fernando Lozano, Charles Anderson

This paper describes a novel method to solve average-reward semi-Markov decision processes, by reducing them to a minimal sequence of cumulative reward problems. The usual solution methods for problems of this type update the gain (optimal average reward) immediately after observing the result of taking an action. The alternative introduced, optimal nudging, relies instead on setting the gain t...
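A minimal sketch of the conventional per-step gain update the abstract contrasts with optimal nudging (the reward distribution and step size below are illustrative assumptions, not from the paper): the gain estimate is moved toward each observed reward immediately after the action.

```python
# Incremental estimate of the gain (average reward), updated after
# every observation; the reward stream here is a stand-in.
import random

random.seed(0)
rho = 0.0          # running estimate of the gain
beta = 0.01        # gain step size

for t in range(10_000):
    r = random.gauss(2.0, 1.0)   # hypothetical observed reward
    rho += beta * (r - rho)      # immediate average-reward update

print(rho)   # drifts toward the true mean reward, 2.0
```

Optimal nudging, by contrast, fixes the gain across an entire cumulative-reward subproblem and adjusts it only between subproblems, which is what makes the reduction in the abstract possible.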

2009
Estela Camara, Antoni Rodriguez-Fornells, Zheng Ye, Thomas F. Münte

An assortment of human behaviors is thought to be driven by rewards including reinforcement learning, novelty processing, learning, decision making, economic choice, incentive motivation, and addiction. In each case the ventral tegmental area/ventral striatum (nucleus accumbens) (VTA-VS) system has been implicated as a key structure by functional imaging studies, mostly on the basis of standard...

Journal: Automatica, 2010
Rahul Jain, Pravin Varaiya

We generalize and build on the PAC Learning framework for Markov Decision Processes developed in Jain and Varaiya (2006). We consider the reward function to depend on both the state and the action. Both the state and action spaces can potentially be countably infinite. We obtain an estimate for the value function of a Markov decision process, which assigns to each policy its expected discounted...
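A minimal sketch of the quantity such an estimate targets (a hypothetical finite two-state chain for illustration; the paper itself allows countably infinite state and action spaces): the expected discounted reward of a fixed policy, obtained here by solving the linear Bellman system directly.

```python
# Exact policy evaluation on a toy 2-state MDP under a fixed policy.
# Rewards depend on the state-action pair chosen by the policy.
import numpy as np

gamma = 0.9
P_pi = np.array([[0.5, 0.5],     # transitions under the fixed policy
                 [0.2, 0.8]])
r_pi = np.array([1.0, -1.0])     # reward of the policy's action per state

# Bellman equation: V = r + gamma P V  =>  (I - gamma P) V = r
V = np.linalg.solve(np.eye(2) - gamma * P_pi, r_pi)
print(V)
```

On finite chains this linear solve is exact; the PAC framework in the abstract is concerned with how well sample-based estimates of this same value converge when the spaces are too large to enumerate.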

1999
Omid Madani, Steve Hanks, Anne Condon

We investigate the computability of problems in probabilistic planning and partially observable infinite-horizon Markov decision processes. The undecidability of the string-existence problem for probabilistic finite automata is adapted to show that the following problem of plan existence in probabilistic planning is undecidable: given a probabilistic planning problem, determine whether there exist...

[Chart: number of search results per year]