Search results for: t policy
Number of results: 957380
We consider the setting in which an electric power utility seeks to curtail its peak electricity demand by offering a fixed group of customers a uniform price for reductions in consumption relative to their predetermined baselines. The underlying demand curve, which describes the aggregate reduction in consumption in response to the offered price, is assumed to be affine and subject to unobserv...
One of the requirements for the development of societies and organizations is innovation, changes in policies, and the use of creative models in decision making. To achieve this goal, organizations must explore opportunities, identify new solutions and possibilities, and apply innovation in their decisions. How to identify these opportunities and use them has always been a concern of p...
Projective cone scheduling defines a large class of rate-stabilizing policies for queueing models relevant to several applications. While there exists considerable theory on the properties of projective cone schedulers, there is little practical guidance on choosing the parameters that define them. In this paper, we propose an algorithm for designing an automated projective cone scheduling syst...
Time-Average and Asymptotically Optimal Flow Control Policies in Networks with Multiple Transmitters
We consider M transmitting stations sending packets to a single receiver over a slotted time-multiplexed link. For each phase consisting of T consecutive slots, the receiver dynamically allocates these slots among the M transmitters. Our objective is to characterize policies that minimize the long-term average of the total number of messages awaiting service at the M transmitters. We establish ...
Thyme (Thymus), of the family Lamiaceae, is a perennial plant of considerable importance owing to its medicinal properties. Seeds of five populations belonging to four species, T. lancifolius, T. daenensis subsp. daenensis (two samples), T. fedtschenkoi, and T. pubescens, were sown, and after germination the root apical meristem was used for karyotype studies. The results showed that the basic chromosome number in all populations studied was x = 15 ...
This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems. 1 Problem Statement In reinforcement learning, we have an agent which is in a state s and draws actions a from a policy π. Upon an actio...
Despite the fact that, under paragraph (a) of Article 55 of the Fourth Development Plan and Article 21 of the Fifth Development Plan of the constitution, the government is required to establish a skill training policy-making institution, efforts have so far been fruitless and the problems caused by the lack of such a national institution still remain. This paper aims to study the current condit...
The autoregressor $f_\pi(y_{-1}, \ldots, y_{-\tau})$ is typically selected from a class of autoregressors $F$. In our experiments, we use regularized linear autoregressors as $F$. Consider a generic learning policy $\hat{\pi}$ with rolled-out trajectory $\hat{Y} = \{\hat{y}_t\}_{t=1}$ corresponding to the input sequence $X = \{x_t\}_{t=1}$. We form the state sequence $S = \{s_t\}_{t=1} = \{[x_t, \ldots, x_{t-\tau}, \hat{y}_{t-1}, \ldots, \hat{y}_{t-\tau}]\}_{t=1}$. We approximate the s...
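The state construction described in this snippet can be illustrated with a minimal Python sketch. It assumes scalar inputs and outputs, and it zero-pads any index that falls before the start of the sequence. The padding rule and the function name `build_states` are assumptions made for illustration and do not come from the source.

```python
from typing import List, Sequence


def build_states(x: Sequence[float], y_hat: Sequence[float], tau: int) -> List[List[float]]:
    """Form s_t = [x_t, ..., x_{t-tau}, y_hat_{t-1}, ..., y_hat_{t-tau}] for each t.

    Indices before the start of the sequence are zero-padded
    (an assumption of this sketch, not something stated in the source).
    """
    if len(x) != len(y_hat):
        raise ValueError("x and y_hat must have the same length")
    if tau < 0:
        raise ValueError("tau must be non-negative")

    def at(seq: Sequence[float], i: int) -> float:
        # Return seq[i], or 0.0 when i points before the first element.
        return seq[i] if i >= 0 else 0.0

    states = []
    for t in range(len(x)):
        inputs = [at(x, t - k) for k in range(tau + 1)]          # x_t ... x_{t-tau}
        outputs = [at(y_hat, t - k) for k in range(1, tau + 1)]  # y_hat_{t-1} ... y_hat_{t-tau}
        states.append(inputs + outputs)
    return states
```

Each state has length 2 * tau + 1: the current input with tau lagged inputs, followed by tau lagged outputs of the rolled-out trajectory.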
In this paper we present the first empirical study of the emphatic temporal-difference learning algorithm (ETD), comparing it with conventional temporal-difference learning, in particular, with linear TD(0), on on-policy and off-policy variations of the Mountain Car problem. The initial motivation for developing ETD was that it has good convergence properties under off-policy training (Sutton, M...