Search results for: critic weighting

Number of results: 69016

2016
Ngo Anh Vien Peter Englert Marc Toussaint

Modeling policies in reproducing kernel Hilbert space (RKHS) renders policy gradient reinforcement learning algorithms non-parametric. As a result, the policies become very flexible and have a rich representational potential without a predefined set of features. However, their performances might be either non-covariant under reparameterization of the chosen kernel, or very sensitive to step-siz...

2016
Huitian Lei

An Online Actor Critic Algorithm and a Statistical Decision Procedure for Personalizing Intervention, by Huitian Lei. Chair: Professor Susan A. Murphy, Assistant Professor Ambuj Tewari. Increasing technological sophistication and widespread use of smartphones and wearable devices provide opportunities for innovative health interventions. An Adaptive Intervention (AI) personalizes the type, mode and...

Journal: Scientific-Research Quarterly of Biomedical Engineering 2009
Babak Mohammadzadeh Asl Ali Mahloojifar

In recent years, adaptive beamforming methods have been used to improve the quality of ultrasound images. Because these methods use information from the medium and update the weights applied to the array elements moment by moment, they have been very successful at improving the resolution of ultrasound images. However, this gain in resolution comes at the cost of reduced image contrast compared with non-adaptive methods. In this paper, a method ...

2002
Matti Aksela Ramunas Girdziusas Jorma Laaksonen Erkki Oja Jari Kangas

This paper discusses a combination of two techniques for improving the recognition accuracy of on-line handwritten character recognition: committee classification and adaptation to the user. A novel adaptive committee structure, namely the Class-Confidence Critic Combination (CCCC) scheme, is presented and evaluated. It is shown to be able to improve significantly on its member classifiers. Als...

1987
Gerhard Fischer

Our goal is to establish the conceptual foundations for using the computational power that is or will be available on computer systems. Much of the available computing power is wasted, however, if users have difficulty understanding and using the full potential of these systems. Too much attention in the past has been given to the technology of computer systems and not enough to the effects of...

Journal: CMAJ : Canadian Medical Association journal = journal de l'Association medicale canadienne 2009
Julie Strong

Journal: CoRR 2018
Mikolaj Binkowski Dougal J. Sutherland Michael Arbel Arthur Gretton

We investigate the training and performance of generative adversarial networks using the Maximum Mean Discrepancy (MMD) as critic, termed MMD GANs. As our main theoretical contribution, we clarify the situation with bias in GAN loss functions raised by recent work: we show that gradient estimators used in the optimization process for both MMD GANs and Wasserstein GANs are unbiased, but learning...

Journal: CoRR 2018
Xiaoqin Zhang Huimin Ma

Pretraining with expert demonstrations has been found useful in speeding up the training process of deep reinforcement learning algorithms, since less online simulation data is required. Some people use supervised learning to speed up the process of feature learning; others pretrain the policies by imitating expert demonstrations. However, these methods are unstable and not suitable for actor-c...

2012
M. Sedighizadeh A. Rezazadeh

A self tuning PID control strategy using reinforcement learning is proposed in this paper to deal with the control of wind energy conversion systems (WECS). Actor-Critic learning is used to tune PID parameters in an adaptive way by taking advantage of the model-free and on-line learning properties of reinforcement learning effectively. In order to reduce the demand of storage space and to impro...

Journal: Adaptive Behaviour 2005
Mehdi Khamassi Loïc Lachèze Benoît Girard Alain Berthoz Agnès Guillot

Since 1995, numerous Actor-Critic architectures for reinforcement learning have been proposed as models of dopamine-like reinforcement learning mechanisms in the rat’s basal ganglia. However, these models were usually tested in different tasks, and it is then difficult to compare their efficiency for an autonomous animat. We present here the comparison of four architectures in an animat as it p...
