reward penalty scheme

Search results for: reward penalty scheme

Number of results: 265788

Intelligent Agent Querying

1987

Tim Doolan, Christos Dimitrakakis

Given agents which have learned the transition functions for a discrete MDP, we examine how another agent, with no information about the state space, can efficiently determine how to query the other agents for knowledge of the MDP. This includes determining when to stop querying, so as to maximize the reward minus the penalty for each query (the reward in the MDP will increase due to more knowledge a...


Inducing Coordination in Supply Chains through Linear Reward Schemes

2005

Boaz Golany, Uriel G. Rothblum

Decentralized decision-making in supply chain management is quite common, and often inevitable, due to the magnitude of the chain, its geographical dispersion, and the number of agents that play a role in it. But decentralized decision-making is known to result in inefficient Nash equilibrium outcomes, and optimal outcomes that maximize the sum of the utilities of all agents need not be Nash e...


Secure Rewarding Schemes

1997

Chun-I Fan, Chin-Laung Lei

In this paper, we propose a secure rewarding scheme. In the scheme, a reward provider publishes a problem, and provides a reward for a person who can supply him a satisfactory solution of the problem. The first qualified claimant with a satisfactory solution of the problem is selected to obtain the reward. The selected claimant can obtain the reward from the reward provider without revealing ...


Harnessing Anomalous Preferences of Anonymous Users for Lean Information Systems Development

2013

Adarsh Kumar Kakar

When features are added to an existing Information Systems (IS) product in response to market demands it is important to assess their business value before implementing them into the product. But how does one estimate the true value of a new feature? Is it sufficient to consider only the consumer reward for including a feature into the product or is it also useful to evaluate the consumer penal...


Fitness Function for Genetic Algorithm used in Intrusion Detection System

2012

Firas Alabsi, Reyadh Naoum

Computer network usage has increased rapidly in recent decades. Intruders try to satisfy their needs through many types of attack, depending on their objectives, which encourages researchers to find more and more solutions to detect those attacks. An Intrusion Detection System (IDS) is used to detect attacks, and a Genetic Algorithm is used to support the IDS. The Fitness Function is helpful in chromosome evaluat...


Approximation Schemes for Multiperiod Binary Knapsack Problems

Journal: Lecture Notes in Computer Science 2021

An instance of the multiperiod binary knapsack problem (MPBKP) is given by a horizon length T, a non-decreasing vector of sizes $$(c_1, \ldots , c_T)$$, where $$c_t$$ denotes the cumulative size for periods $$1,\ldots ,t$$, and a list of n items. Each item is a triple (r, q, d), where r denotes the reward or value of the item, q its size, and d its time index (or deadline). The goal is to choose, for each deadline t, which items to include to maximize the total re...


Hotel Service Analysis by Penalty-Reward Contrast Technique for Online Review Data

Journal: Sustainability 2022

Hotel reviews play an important role in the selection of hotels by travelers. Online travel platforms (e.g., Tripadvisor, Expedia) provide multi-criteria (room, service, location, sleep quality, etc.) ratings to make it easier for travelers to choose a hotel from reviews. Through penalty-reward contrast analysis (PRCA), this study aims to explore the asymmetric effects of attribute performance (Value, Cleanl...


Reward-Penalty Mechanism or Subsidy Mechanism: A Closed-Loop Supply Chain Perspective

Journal: Mathematics 2022

The government plays a crucial role in regulating the closed-loop supply chain (CLSC). We investigated the reward-penalty mechanism (RPM) for the manufacturer and the subsidy mechanism (SM) for the collector in CLSCs. The government’s goal is to maximize social welfare. Based on centralized and decentralized decision-making models without government intervention, we developed two CLSC models where the government rewards or penalizes the manufacturer and subsidizes the collector. Then, the impact...


Representations of Decision-Theoretic Planning Tasks

2000

Sven Koenig, Yaxin Liu

Goal-directed Markov Decision Process models (GDMDPs) are good models for many decision-theoretic planning tasks. They have been used in conjunction with two different reward structures, namely the goal-reward representation and the action-penalty representation. We apply GDMDPs to planning tasks in the presence of traps such as steep slopes for outdoor robots or staircases for indoor robots, a...


The distribution of rewards in sensorimotor maps acquired by cognitive robots through exploration

Journal: Neurocomputing 2011

Vishwanathan Mohan, Pietro G. Morasso, Giorgio Metta, Stathis Kasderidis

To exhibit intelligent behavior, cognitive robots must have some knowledge about the consequences of their actions and their value in the context of the goal being realized. We present a neural framework using which explorative sensorimotor experiences of cognitive robots can be efficiently ‘internalized’ using growing sensorimotor maps and planning realized using goal induced quasi-stationary ...

