Search results for: q policy
Number of results: 381585
In this study, we examine the effect on the retailer's ordering policy of a temporary price discount offered by the supplier for an order larger than the regular order. Demand is assumed to be stock-dependent, and units in inventory deteriorate at a constant rate. The study helps the retailer decide whether to adopt a regular or a special order policy. The optimum special qu...
This study considers a two-echelon supply chain system consisting of one manufacturer and one retailer. Under a vendor-managed inventory (VMI) program, the manufacturer is authorized to manage inventories of agreed-upon stock keeping units at the retailer's locations. In this paper, we assume that the manufacturer monitors inventory levels at the retailer's location and replenishes her stock under (...
Value functions can speed the learning of a solution to Markov decision problems by providing a prediction of reinforcement against which received reinforcement is compared. Once the learned values reflect the optimal relative ordering of actions, further learning is not necessary. In fact, further learning can lead to the disruption of the optimal policy if the value function is implemented wi...
Decision-theoretic planning is a popular approach to sequential decision-making problems because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extr...
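The recursive dynamic-programming computation described in this snippet can be sketched as Q-value iteration on a toy MDP. The two-state MDP below (transition tensor `P`, reward matrix `R`, discount `gamma`) is a hypothetical illustration, not taken from the cited paper:

```python
# Q-value iteration: compute an optimal Q-value function by repeatedly
# applying the Bellman optimality backup, then extract a greedy policy.
import numpy as np

n_states, n_actions, gamma = 2, 2, 0.9
# P[s, a, s'] : transition probabilities; R[s, a] : expected immediate rewards
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

Q = np.zeros((n_states, n_actions))
for _ in range(500):
    # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * E_s'[ max_a' Q(s',a') ]
    Q = R + gamma * P @ Q.max(axis=1)

policy = Q.argmax(axis=1)  # optimal policy extracted greedily from Q
```

The contraction property of the backup guarantees convergence to the unique optimal Q-value function for any discount factor below 1.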
Ergodic control for discrete time controlled Markov chains with a locally compact state space and a compact action space is considered under suitable stability, irreducibility and Feller continuity conditions. A flexible family of controls, called action time sharing (ATS) policies, associated with a given continuous stationary Markov control, is introduced. It is shown that the long term avera...
We use simulated soccer to study multiagent learning. Each team's players (agents) share an action set and a policy but may behave differently due to position-dependent inputs. All agents making up a team are rewarded or punished collectively in the case of goals. We conduct simulations with varying team sizes and compare two learning algorithms: TD-Q learning with linear neural networks (TD-Q) and Pro...
Reinforcement learning has been an active research area in recent years, not only in machine learning but also in control engineering, operations research, and robotics. It is a model-free learning control method that can solve Markov decision problems. Q-learning is an incremental dynamic programming procedure that determines the optimal policy in a step-by-step manner. It is an online procedure for...
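The incremental, step-by-step procedure this snippet describes is the tabular Q-learning update. As a minimal sketch, the corridor task below (5 states, actions -1/+1, reward 1 at the rightmost state) and all parameter values are illustrative assumptions, not taken from the cited paper:

```python
# Tabular Q-learning: update Q(s,a) incrementally toward the sampled
# one-step target r + gamma * max_a' Q(s', a'), using off-policy
# uniform-random exploration.
import random

n_states, goal = 5, 4
alpha, gamma = 0.5, 0.95
Q = {(s, a): 0.0 for s in range(n_states) for a in (-1, 1)}

def step(s, a):
    """Deterministic corridor dynamics with a reflecting left wall."""
    s2 = min(max(s + a, 0), n_states - 1)
    return s2, (1.0 if s2 == goal else 0.0), s2 == goal

random.seed(0)
for _ in range(300):                    # episodes
    s = 0
    for _ in range(50):
        a = random.choice((-1, 1))      # off-policy: behave randomly
        s2, r, done = step(s, a)
        # incremental update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s, a] += alpha * (r + gamma * max(Q[s2, -1], Q[s2, 1]) - Q[s, a])
        s = s2
        if done:
            break

# the optimal policy is determined step by step: extract it greedily
greedy = [max((-1, 1), key=lambda a: Q[s, a]) for s in range(n_states)]
```

Because Q-learning is off-policy, the greedy policy converges to the optimal one even though the behavior here is purely random.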
This paper presents a combination of behavior-based control and fuzzy Q-learning for mobile robot navigation systems. Many fuzzy Q-learning algorithms have been proposed to yield individual behaviors such as obstacle avoidance and target finding. However, for complicated tasks, all behaviors need to be combined in one control scheme using behavior-based control. Based on this fac...
Foreign policy takes root in complicated matters; however, this may be especially true of Armenia. Although the present government of Armenia is less than 20 years old, the people of this territory were among the first to officially accept Christianity. In earlier times, these people were part of great empires such as Iran, Rome, and Byzantium. Armenia is regarded as a nation with a privileged hist...
We consider a multi-item two-echelon inventory system in which the central warehouse operates under a (Q, R) policy and the local warehouses implement a base-stock policy. An exact solution procedure is proposed to find the inventory control policy parameters that minimize the system-wide inventory holding and fixed ordering costs, subject to an aggregate mean response time constraint at each facility.
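The two ordering rules this snippet combines can be sketched side by side: a continuous-review (Q, R) rule orders a fixed lot Q whenever the inventory position drops to the reorder point R, while a base-stock rule orders one-for-one up to a target level S. The function names and parameter values below are hypothetical illustrations, not from the cited paper:

```python
# Minimal sketch of the two replenishment rules in a two-echelon system:
# (Q, R) at the central warehouse, base-stock at the local warehouses.

def qr_order(inventory_position: int, Q: int, R: int) -> int:
    """Lot size to order under a (Q, R) policy: a fixed batch Q is
    triggered when the inventory position falls to R or below."""
    return Q if inventory_position <= R else 0

def base_stock_order(inventory_position: int, S: int) -> int:
    """Quantity to order under a base-stock (order-up-to-S) policy:
    replenish one-for-one back up to the base-stock level S."""
    return max(S - inventory_position, 0)

# e.g. central warehouse with Q=50, R=20; a local warehouse with S=30
central = qr_order(15, Q=50, R=20)       # position 15 <= R=20, order the lot
local = base_stock_order(12, S=30)       # order up to 30 from position 12
```

The fixed lot Q trades ordering cost against holding cost at the upper echelon, while the base-stock rule keeps local response times short, which is what the aggregate mean response time constraint governs.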