reward penalty scheme

Search results for: reward penalty scheme

Number of results: 265788

Strategic Disclosure of Valuable Information within Competitive Environments

2008

Young-Ro Yoon

Can valuable information be disclosed intentionally by the informed agent even within a competitive environment? In this article, we bring our interest into the asymmetry in reward and penalty in the payoff structure and explore its effects on the strategic disclosure of valuable information. According to our results, the asymmetry in reward and penalty is a necessary condition for the disclosu...

Adaptive critic for sigma-pi networks

Journal: Neural Networks, 1996

Richard Stuart Neville, T. John Stonham

This article presents an investigation which studied how training of sigma-pi networks with the associative reward-penalty (AR-P) regime may be enhanced by using two networks in parallel. The technique uses what has been termed an unsupervised "adaptive critic element" (ACE) to give critical advice to the supervised sigma-pi network. We utilise the conventions that the sigma-pi neuron mod...

Effects of reward contingencies on brain activation during feedback processing

2014

Yi Jiang, Sung-il Kim, Mimi Bong

This study investigates differential neural activation patterns in response to reward-related feedback depending on various reward contingencies. Three types of reward contingencies were compared: a "gain" contingency (a monetary reward for correct answer/no monetary penalty for incorrect answer); a "lose" contingency (no monetary reward for correct answer/a monetary penalty for incorrect answe...

A Comparison of Continuous and Discretized Pursuit Learning Schemes

2000

B. John Oommen, Mariana Agache

A Learning Automaton is an automaton that interacts with a random environment, having as its goal the task of learning the optimal action based on its acquired experience. Many learning automata have been proposed, with the class of Estimator Algorithms being among the fastest ones. Thathachar and Sastry [23], through the Pursuit Algorithm, introduced the concept of learning algorithms that pur...
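
For readers unfamiliar with learning automata, the interaction loop described in this abstract can be illustrated with a minimal sketch. It assumes the standard textbook linear reward-penalty update, not the Pursuit Algorithm the paper studies; the function names `lrp_update` and `run`, the step sizes `a` and `b`, and the environment's reward probabilities are all illustrative choices.

```python
import random


def lrp_update(probs, action, rewarded, a=0.1, b=0.1):
    """One linear reward-penalty step on an action-probability vector.

    On reward, probability mass moves toward the chosen action (rate a).
    On penalty, mass moves away from it and is spread over the others (rate b).
    Setting b=0 gives the reward-inaction variant.
    """
    r = len(probs)
    new = []
    for i, p in enumerate(probs):
        if rewarded:
            new.append(p + a * (1 - p) if i == action else (1 - a) * p)
        else:
            new.append((1 - b) * p if i == action else b / (r - 1) + (1 - b) * p)
    return new


def run(reward_probs, steps=2000, a=0.05, b=0.0, seed=0):
    """Let the automaton interact with a stationary random environment.

    reward_probs[i] is the (assumed) probability that action i is rewarded.
    """
    rng = random.Random(seed)
    probs = [1 / len(reward_probs)] * len(reward_probs)
    for _ in range(steps):
        action = rng.choices(range(len(probs)), weights=probs)[0]
        rewarded = rng.random() < reward_probs[action]
        probs = lrp_update(probs, action, rewarded, a, b)
    return probs


if __name__ == "__main__":
    print(run([0.8, 0.2]))
```

Both branches of the update keep the probabilities summing to one, which is why the vector can be fed straight back into the sampler on the next step.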

Visual estimation under risk.

Journal: Journal of Vision, 2007

Michael S Landy, Ross Goutcher, Julia Trommershäuser, Pascal Mamassian

We investigate whether observers take into account their visual uncertainty in an optimal manner in a perceptual estimation task with explicit rewards and penalties for performance. Observers judged the mean orientation of a briefly presented texture consisting of a collection of line segments. The mean and, in some experiments, the variance of the distribution of line orientations changed from...

Credit Systems for Bycatch and Biodiversity Conservation

Journal: Frontiers in Marine Science, 2021

Credit systems for mitigation of bycatch and habitat impact, as incentive-based approaches, incentivize changes in fishery operator behavior and decision-making and allow flexibility in a least-cost method. Three types of credit systems, originally developed to address environmental pollution, are presented and evaluated as currently underutilized approaches. The first, the cap-and-trade approach, evolved out of direct re...

AFRL-RX-WP-TP-2009-4129: Dynamic Channel Allocation in Wireless Networks Using Learning Automata (Preprint)

2009

Behdis Eslamnour, Maciej Zawodniok

Single channel based wireless networks have limited bandwidth and throughput and the bandwidth utilization decreases due to congestion and interference from other sources. In order to increase the throughput, transmission in multiple channels is considered as an option. In this paper, we propose a distributed dynamic channel allocation scheme using adaptive learning automata for wireless networ...

Reward-penalty Mechanism for Reverse Supply Chain Network with Asymmetric Information and Carbon Emission Constraints

2017

Xiao-qing Zhang, Xi-gang Yuan

Affiliation (both authors): School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, P.R. China. Abstract: In this paper, we discuss the government’s reward and penalty mechanism in the presence of asymmetric information a...

The STAR automaton: expediency and optimality properties

Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics (a publication of the IEEE Systems, Man, and Cybernetics Society), 2002

Anastasios A. Economides, Athanasios Kehagias

We present the STack ARchitecture (STAR) automaton. It is a fixed structure, multiaction, reward-penalty learning automaton, characterized by a star-shaped state transition diagram. Each branch of the star contains D states associated with a particular action. The branches are connected to a central "neutral" state. The most general version of STAR involves probabilistic state transitions in re...

Program Synthesis with Priority Queue Training

2018

Daniel A. Abolafia, Mohammad Norouzi, Jonathan Shen, Rui Zhao, Quoc V. Le

We consider the task of program synthesis in the presence of a reward function over the output of programs, where the goal is to find programs with maximal rewards. We introduce an iterative optimization scheme, where we train an RNN on a dataset of K best programs from a priority queue of the generated programs so far. Then, we synthesize new programs and add them to the priority queue by samp...
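
The "K best programs from a priority queue" idea in this abstract can be illustrated with a small bounded top-K queue. This is only a sketch of the data structure under stated assumptions: the class name `TopKQueue`, the de-duplication of repeated programs, and the use of plain strings for programs are illustrative and not taken from the paper, and the RNN training and sampling steps are omitted entirely.

```python
import heapq


class TopKQueue:
    """Keep the K highest-reward programs seen so far (hypothetical helper)."""

    def __init__(self, k):
        self.k = k
        self._heap = []      # min-heap of (reward, program); root is the worst kept item
        self._seen = set()   # programs currently in the queue, to skip duplicates

    def push(self, reward, program):
        if program in self._seen:
            return
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, (reward, program))
            self._seen.add(program)
        elif reward > self._heap[0][0]:
            # Replace the current worst entry with the better newcomer.
            _, evicted = heapq.heapreplace(self._heap, (reward, program))
            self._seen.discard(evicted)
            self._seen.add(program)

    def best(self):
        """Return the kept (reward, program) pairs, highest reward first."""
        return sorted(self._heap, reverse=True)


if __name__ == "__main__":
    queue = TopKQueue(k=2)
    for reward, program in [(1.0, "a"), (3.0, "b"), (2.0, "c"), (3.0, "b")]:
        queue.push(reward, program)
    print(queue.best())
```

In an iterative scheme of the kind the abstract outlines, the contents of `best()` would serve as the training set for the next round, with newly sampled programs pushed back into the same queue.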
