Search results for: q policy

Number of results: 381585

2012
Nita H. Shah

In this study, the effect of a temporary price discount for a larger order than the regular order offered by the supplier on the retailer’s ordering policy is studied. The demand is assumed to be stock-dependent. The units in inventory deteriorate at a constant rate. This study will help the retailer to take the decision whether to adopt a regular or special order policy. The optimum special qu...

Journal: Industrial Management (مدیریت صنعتی) 0
رسول حجی (Sharif University of Technology), محمد محسن معارف دوست (Sharif University of Technology), سید بابک ابراهیمی (Sharif University of Technology)

This study considers a two-echelon supply chain system consisting of one manufacturer and one retailer. Under a vendor-managed inventory (VMI) program, the manufacturer is authorized to manage inventories of agreed-upon stock keeping units at the retailer's locations. In this paper, we assume that the manufacturer monitors inventory levels at the retailer's location, replenishes her stock under (...

2000
Charles W. Anderson

Value functions can speed the learning of a solution to Markov Decision Problems by providing a prediction of reinforcement against which received reinforcement is compared. Once the learned values relatively reflect the optimal ordering of actions, further learning is not necessary. In fact, further learning can lead to the disruption of the optimal policy if the value function is implemented wi...

Journal: J. Artif. Intell. Res. 2008
Frans A. Oliehoek Matthijs T. J. Spaan Nikos A. Vlassis

Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q is computed in a recursive manner by dynamic programming, and then an optimal policy is extr...

Journal: SIAM J. Control and Optimization 2012
Amarjit Budhiraja Xin Liu Adam Shwartz

Ergodic control for discrete time controlled Markov chains with a locally compact state space and a compact action space is considered under suitable stability, irreducibility and Feller continuity conditions. A flexible family of controls, called action time sharing (ATS) policies, associated with a given continuous stationary Markov control, is introduced. It is shown that the long term avera...

1997
Marco Wiering A. Germond M. Hasler

We use simulated soccer to study multiagent learning. Each team's players (agents) share action set and policy but may behave differently due to position-dependent inputs. All agents making up a team are rewarded or punished collectively in case of goals. We conduct simulations with varying team sizes, and compare two learning algorithms: TD-Q learning with linear neural networks (TD-Q) and Pro...

2013
Sudhir Raj Cheruvu Siva Kumar

Reinforcement learning has been an active research area in recent years, not only in machine learning but also in control engineering, operations research and robotics. It is a model-free learning control method that can solve Markov decision problems. Q-learning is an incremental dynamic programming procedure that determines the optimal policy in a step-by-step manner. It is an online procedure for...

2013
Khairul Anam Son Kuswadi

This paper presents a collaboration of behavior-based control and fuzzy Q-learning for mobile robot navigation systems. Many fuzzy Q-learning algorithms have been proposed to yield individual behaviors such as obstacle avoidance, target finding and so on. However, for complicated tasks, all behaviors need to be combined in one control schema using behavior-based control. Based on this fac...

Thesis: Ministry of Science, Research and Technology - Allameh Tabataba'i University 1388

Foreign policy takes root from complicated matters. However, this may be even more true of Armenia. Although the new government of Armenia is less than 20 years old, the people of this territory were the first to officially accept Christianity. In the distant past, these people were part of great empires such as Iran, Rome, and Byzantium. Armenia is regarded as a nation with a privileged hist...

Journal: Oper. Res. Lett. 2010
Engin Topan Z. Pelin Bayindir Tarkan Tan

We consider a multi-item two-echelon inventory system in which the central warehouse operates under a (Q,R) policy, and the local warehouses implement basestock policy. An exact solution procedure is proposed to find the inventory control policy parameters that minimize the system-wide inventory holding and fixed ordering cost subject to an aggregate mean response time constraint at each facility.
