Search results for: reward penalty scheme

Number of results: 265788

1987
Tim Doolan Christos Dimitrakakis

Given agents which have learned the transition functions for a discrete MDP, we examine how another agent, with no information about the state space, can efficiently determine how to query the other agents for knowledge of the MDP. This includes determining when to stop querying, so as to maximize the reward minus the penalty for each query (the reward in the MDP will increase due to more knowledge a...

2005
Boaz Golany Uriel G. Rothblum

Decentralized decision-making in supply chain management is quite common, and often inevitable, due to the magnitude of the chain, its geographical dispersion, and the number of agents that play a role in it. But, decentralized decision-making is known to result in inefficient Nash equilibrium outcomes, and optimal outcomes that maximize the sum of the utilities of all agents need not be Nash e...

1997
Chun-I Fan Chin-Laung Lei

In this paper, we propose a secure rewarding scheme. In the scheme, a reward provider publishes a problem, and provides a reward for a person who can supply him a satisfactory solution of the problem. The first qualified claimant with a satisfactory solution of the problem is selected to obtain the reward. The selected claimant can obtain the reward from the reward provider without revealing ...

2013
Adarsh Kumar Kakar

When features are added to an existing Information Systems (IS) product in response to market demands it is important to assess their business value before implementing them into the product. But how does one estimate the true value of a new feature? Is it sufficient to consider only the consumer reward for including a feature into the product or is it also useful to evaluate the consumer penal...

2012
Firas Alabsi Reyadh Naoum

Computer network usage has increased rapidly in recent decades. Intruders try to satisfy their needs through many types of attack, depending on their objectives, and this encourages researchers to find more and more solutions to detect those attacks. An Intrusion Detection System (IDS) is used to detect the attack. A Genetic Algorithm is used to support the IDS. The Fitness Function is helpful in chromosome evaluat...

Journal: Lecture Notes in Computer Science 2021

An instance of the multiperiod binary knapsack problem (MPBKP) is given by a horizon length T, a non-decreasing vector of sizes $$(c_1, \ldots , c_T)$$ where $$c_t$$ denotes the cumulative size for periods $$1,\ldots ,t$$, and a list of n items. Each item is a triple (r, q, d), where r is the reward or value of the item, q is its size, and d is its time index (or deadline). The goal is to choose, for each deadline t, which items to include to maximize the total re...

Journal: Sustainability 2022

Hotel reviews play an important role in the selection of hotels by travelers. Online travel platforms (e.g., Tripadvisor, Expedia) provide multi-criteria (room, service, location, sleep quality, etc.) ratings to make it easier for travelers to choose a hotel from reviews. Through penalty-reward contrast analysis (PRCA), this study aims to explore the asymmetric effects of attribute performance (Value, Cleanl...

Journal: Mathematics 2022

The government plays a crucial role in regulating the closed-loop supply chain (CLSC). We investigated the reward-penalty mechanism (RPM) for the manufacturer and the subsidy mechanism (SM) for the collector in CLSCs. The government’s goal is to maximize social welfare. Based on centralized and decentralized decision-making models without government intervention, we developed two CLSC models in which the government rewards or penalizes the manufacturer and subsidizes the collector. Then, the impact...

2000
Sven Koenig Yaxin Liu

Goal-directed Markov Decision Process models (GDMDPs) are good models for many decision-theoretic planning tasks. They have been used in conjunction with two different reward structures, namely the goal-reward representation and the action-penalty representation. We apply GDMDPs to planning tasks in the presence of traps such as steep slopes for outdoor robots or staircases for indoor robots, a...

Journal: Neurocomputing 2011
Vishwanathan Mohan Pietro G. Morasso Giorgio Metta Stathis Kasderidis

To exhibit intelligent behavior, cognitive robots must have some knowledge about the consequences of their actions and their value in the context of the goal being realized. We present a neural framework using which explorative sensorimotor experiences of cognitive robots can be efficiently ‘internalized’ using growing sensorimotor maps and planning realized using goal induced quasi-stationary ...
