Search results for: t policy

Number of results: 957380

2005

We consider M transmitting stations sending packets to a single receiver over a slotted time-multiplexed link. For each phase consisting of T consecutive slots, the receiver dynamically allocates these slots among the M transmitters. Our objective is to characterize policies that minimize the long-term average of the total number of messages awaiting service at the M transmitters. We establish ...

Journal: CoRR 2016
Kia Khezeli, Eilyan Bitar

We consider the setting in which an electric power utility seeks to curtail its peak electricity demand by offering a fixed group of customers a uniform price for reductions in consumption relative to their predetermined baselines. The underlying demand curve, which describes the aggregate reduction in consumption in response to the offered price, is assumed to be affine and subject to unobserv...

One of the requirements for the development of societies and organizations is innovation, changes in policies, and the use of creative models in decision making. To achieve this goal, organizations must explore opportunities, identify new solutions and possibilities, and apply innovation in their decisions. How to identify these opportunities and use them has always been a concern of p...

Journal: CoRR 2018
Neal Master

Projective cone scheduling defines a large class of rate-stabilizing policies for queueing models relevant to several applications. While there exists considerable theory on the properties of projective cone schedulers, there is little practical guidance on choosing the parameters that define them. In this paper, we propose an algorithm for designing an automated projective cone scheduling syst...

1991
Redha M. Bournas, Frederick J. Beutler, Demosthenis Teneketzis

We consider M transmitting stations sending packets to a single receiver over a slotted time-multiplexed link. For each phase consisting of T consecutive slots, the receiver dynamically allocates these slots among the M transmitters. Our objective is to characterize policies that minimize the long-term average of the total number of messages awaiting service at the M transmitters. We establish ...

Zahra Daftari, Abbas Safarnejad

Thyme (Thymus), of the family Lamiaceae, is a perennial plant of great importance because of its medicinal properties. Seeds of five populations belonging to four species, T. lancifolius, T. daenensis subsp. daenensis (two samples), T. fedtschenkoi, and T. pubescens, were sown, and after the seeds germinated, root apical meristems were used for karyotype studies. The results showed that the basic chromosome number in all the populations studied was x = 15...

2010
Jan Peters, Katharina Mülling, Yasemin Altun

This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one-shot solutions. Future work may include the application to interesting problems. 1 Problem Statement. In reinforcement learning, we have an agent which is in a state s and draws actions a from a policy π. Upon an actio...

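Journal: Organizational Culture Management (مدیریت فرهنگ سازمانی)
Ahmad Attarnia (PhD student in Policy-Making Management, Farabi Campus, University of Tehran); Hossein Khanifar (Professor, Department of Management, Faculty of Management and Accounting, Farabi Campus, University of Tehran); Mohammad Hossein Rahmati (Assistant Professor, Department of Management, Faculty of Management and Accounting, Farabi Campus, University of Tehran); Gholamreza Jandaghi (Professor, Department of Management, Faculty of Management and Accounting, Farabi Campus, University of Tehran)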

Despite the fact that, under paragraph (a) of Article 55 of the Fourth Development Plan and Article 21 of the Fifth Development Plan of the constitution, the government is required to establish a skill-training policy-making institution, efforts have so far been fruitless, and the problems caused by the lack of such a national institution remain. This paper aims to study the current condit...

2016
Jianhui Chen, Hoang M. Le, Peter Carr, Yisong Yue, James J. Little

The autoregressor $f_\pi(y_{-1}, \ldots, y_{-\tau})$ is typically selected from a class of autoregressors $\mathcal{F}$. In our experiments, we use regularized linear autoregressors as $\mathcal{F}$. Consider a generic learning policy $\hat{\pi}$ with rolled-out trajectory $\hat{Y} = \{\hat{y}_t\}_{t=1}$ corresponding to the input sequence $X = \{x_t\}_{t=1}$. We form the state sequence $S = \{s_t\}_{t=1} = \{[x_t, \ldots, x_{t-\tau}, \hat{y}_{t-1}, \ldots, \hat{y}_{t-\tau}]\}_{t=1}$. We approximate the s...
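The state construction in the snippet above can be illustrated with a minimal sketch. It assumes scalar inputs and predictions and zero-pads indices that fall before the start of the sequence; the function name `build_states` and the padding choice are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def build_states(x, y_hat, tau):
    """Stack s_t = [x_t, ..., x_{t-tau}, y_hat_{t-1}, ..., y_hat_{t-tau}] for every t.

    Assumption: x and y_hat are 1-D sequences of equal length, and any index
    before the start of the sequence is replaced by 0.0 (zero padding).
    """
    x = np.asarray(x, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    if len(x) != len(y_hat):
        raise ValueError("x and y_hat must have the same length")

    states = []
    for t in range(len(x)):
        # Current and past inputs: x_t, x_{t-1}, ..., x_{t-tau}
        xs = [x[t - k] if t - k >= 0 else 0.0 for k in range(tau + 1)]
        # Past predictions only: y_hat_{t-1}, ..., y_hat_{t-tau}
        ys = [y_hat[t - k] if t - k >= 0 else 0.0 for k in range(1, tau + 1)]
        states.append(xs + ys)
    return np.array(states)

# Each row has (tau + 1) input entries followed by tau prediction entries.
S = build_states([1.0, 2.0, 3.0], [10.0, 20.0, 30.0], tau=1)
```

Each row of `S` holds the current and lagged inputs followed by the lagged predictions, so the current prediction never appears in its own state.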

Journal: CoRR 2017
Sina Ghiassian, Banafsheh Rafiee, Richard S. Sutton

In this paper we present the first empirical study of the emphatic temporal-difference learning algorithm (ETD), comparing it with conventional temporal-difference learning, in particular, with linear TD(0), on on-policy and off-policy variations of the Mountain Car problem. The initial motivation for developing ETD was that it has good convergence properties under off-policy training (Sutton, M...
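For readers unfamiliar with the baseline named in the snippet above, here is a minimal sketch of one on-policy linear TD(0) weight update in its standard textbook form. It is not the ETD algorithm and not the paper's experimental setup; the function name `td0_update`, the feature vectors, and the step-size and discount values are illustrative assumptions.

```python
import numpy as np

def td0_update(w, phi, reward, phi_next, alpha=0.1, gamma=1.0):
    """One linear TD(0) step: w <- w + alpha * delta * phi.

    The value estimate is v(s) = w . phi(s), and delta is the TD error
    reward + gamma * v(s') - v(s). For a terminal next state, pass a
    zero vector as phi_next.
    """
    w = np.asarray(w, dtype=float)
    phi = np.asarray(phi, dtype=float)
    phi_next = np.asarray(phi_next, dtype=float)
    delta = reward + gamma * np.dot(w, phi_next) - np.dot(w, phi)
    return w + alpha * delta * phi

# Two updates on the same hypothetical transition.
w = np.zeros(2)
w = td0_update(w, [1.0, 0.0], 1.0, [0.0, 1.0], alpha=0.5, gamma=0.9)
w = td0_update(w, [1.0, 0.0], 1.0, [0.0, 1.0], alpha=0.5, gamma=0.9)
```

Only the weights of features active in the current state move, which is why the second weight stays at zero in this example.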
