reward penalty scheme

Search results for: reward penalty scheme

Number of results: 265788

Penalty and reward contracts between a manufacturer and its logistics service provider

Journal: Logistics Research 2016

A Lyapunov-Based Methodology for Constrained Optimization with Bandit Feedback

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence 2022

In a wide variety of applications including online advertising, contractual hiring, and wireless scheduling, the controller is constrained by a stringent budget constraint on the available resources, which are consumed in a random amount by each action, and a stochastic feasibility constraint that may impose important operational limitations on decision-making. In this work, we consider a general model to address such problems, w...

Reward Design for Multi-Agent Reinforcement Learning with a Penalty Based on the Payment Mechanism

Journal: Transactions of The Japanese Society for Artificial Intelligence 2021

In this paper, we propose a novel method of reward design for multi-agent reinforcement learning (MARL). One of the main uses of MARL is building cooperative policies between self-interested agents. We take inspiration from the concept of mechanism design from game theory to modify how agents are rewarded in MARL algorithms. We defined a payment that reflects the negative contribution to other agents’ valuation in the same manner as Vickrey-C...

Hybrid evolutionary optimization for takeaway order selection and delivery path planning utilizing habit data

Journal: Complex & Intelligent Systems 2021

Abstract: The last years have seen a rapid growth of the takeaway delivery market, which has provided a lot of jobs for deliverymen. However, increasing numbers of takeaway orders and the corresponding pickup and service points have made order selection and path planning a key challenging problem to deliverymen. In this paper, we present a problem integrating order selection and delivery path planning for deliverymen, the objective of which is to maximize the revenue per unit time subject to maximum delivery path length, overdue penalt...

A Navigation and Obstacle Avoidance Algorithm for Mobile Robots Operating in Unknown, Maze-Type Environments

2006

R. Clark A. El-Osery K. Wedeward

This paper describes two complementary algorithms developed for mobile robots operating within unknown, maze-type environments. The first is an environmental mapping and navigation algorithm which ensures complete coverage of a maze with a priori unknown wall locations, and the second a stochastic learning automaton approach for general obstacle avoidance within the maze. The environmental mappi...

On Using Stochastic Automata for Trajectory Planning of Robot Manipulators in Noisy Workspaces

2004

B. J. Oommen S. Sitharam Iyengar Nicte Andrade

We consider the problem of a robot manipulator operating in a noisy workspace. The robot is assigned the task of moving from Pi to Pf. Since Pi is its initial position, this position can be known fairly accurately. However, since Pf is usually obtained as a result of a sensing operation, possibly vision sensing, we assume that Pf is noisy. We propose a solution to achieve the motion which invo...

Towards a Theoretic Understanding of DCEE

2010

Scott Alfeld Matthew E. Taylor Prateek Tandon Milind Tambe

Common wisdom says that the greater the level of teamwork, the higher the performance of the team. In teams of cooperative autonomous agents, working together rather than independently can increase the team reward. However, recent results show that in uncertain environments, increasing the level of teamwork can actually decrease overall performance. Coined the team uncertainty penalty, this phe...

Link Monotonic Allocation Schemes

Journal: IGTR 2005

Marco Slikker

A network is a graph where the nodes represent players and the links represent bilateral interaction between the players. A reward game assigns a value to every network on a fixed set of players. An allocation scheme specifies how to distribute the worth of every network among the players. This allocation scheme is link monotonic if extending the network does not decrease the payoff of any play...

Coordination Game Analysis through Penalty Scheme in Freight Intermodal Service

Journal: Mathematical Problems in Engineering 2012

Dual Formulation of Controlled Markov Diffusions and Its Application

2014

Fan Ye Enlu Zhou

Information relaxation and duality in Markov decision processes have been studied recently to derive upper bounds on the maximal expected reward (or lower bounds on the minimal expected cost). The idea is to relax the non-anticipativity constraint on the controls and impose a penalty to punish such a violation. In this paper we generalize this dual approach to controlled Markov diffusions. We d...
